Preface


Owing to the rapid advances in the physical sciences and engineering, the
demand for higher-level mathematics is increasing yearly. This book is designed
for advanced undergraduates and graduate students who are interested in the
mathematical aspects of their own fields of study. The reader is assumed to
have a knowledge of undergraduate-level calculus and linear algebra.

There are any number of books available on mathematics for physics and
engineering, but they all fall into one of two categories: one emphasizes
mathematical rigor and the exposition of definitions or theorems, whereas the
other is concerned primarily with applying mathematics to practical problems.
We believe that neither of these approaches alone is particularly helpful
to physicists and engineers who want to understand the mathematical background
of the subjects with which they are concerned. This book is different
in that it provides a short path to higher mathematics via a combination of
these approaches. A sizable portion of this book is devoted to theorems and
definitions with their proofs, and we are convinced that the study of these
proofs, which range from trivial to difficult, is useful for a grasp of the general
idea of mathematical logic. Moreover, several problems have been included at
the end of each section, and complete solutions for all of them are presented
in the greatest possible detail. We firmly believe that ours is a better
pedagogical approach than that found in typical textbooks, where there are many
well-polished problems but no solutions.

This book is essentially self-contained and assumes only standard
undergraduate preparation such as elementary calculus and linear algebra. The
first half of the book covers the following three topics: real analysis,
functional analysis, and complex analysis, along with the preliminaries and four
appendixes. Part I focuses on sequences and series of real numbers and of real
functions, with detailed explanations of their convergence properties. We also
emphasize the concepts of Cauchy sequences and the Cauchy criterion that
determine the convergence of infinite real sequences. Part II deals with the
theory of the Hilbert space, which is the most important class of
infinite-dimensional vector spaces. The completeness property of Hilbert spaces
allows one to develop various types of complex orthonormal polynomials, as
described in the middle of Part II. An introduction to the Lebesgue integration
theory, a subject of ever-increasing importance in physics, is also presented.
Part III describes the theory of complex-valued functions of one complex
variable. All relevant elements including analytic functions, singularity,
residue, continuation, and conformal mapping are described in a self-contained
manner. A thorough understanding of the fundamentals treated is important in
order to proceed to more advanced branches of mathematical physics.

In the second half of the volume, the following three specific topics are
discussed: Fourier analysis, differential equations, and tensor analysis. These
three are the most important subjects in both engineering and the physical
sciences, but their rigorous mathematical structures have hardly been covered
in ordinary textbooks. We know that mathematical rigor is often unnecessary
for practical use. However, the blind usage of mathematical methods as a tool
may lead to a lack of understanding of the symbiotic relationship between
mathematics and the physical sciences. We believe that readers who study
the mathematical structures underlying these three subjects in detail will
acquire a better understanding of the theoretical backgrounds associated with
their own fields. Part IV describes the theory of Fourier series, the Fourier
transform, and the Laplace transform, with a special emphasis on the proofs
of their convergence properties. A more contemporary subject, the wavelet
transform, is also described toward the end of Part IV. Part V deals with
ordinary and partial differential equations. The existence theorem and stability
theory for solutions, which serve as the underlying basis for differential
equations, are described with rigorous proofs. Part VI is devoted to the calculus of
tensors in terms of both Cartesian and non-Cartesian coordinates, along with
the essentials of differential geometry. An alternative tensor theory expressed
in terms of abstract vector spaces is developed toward the end of Part VI.

The authors hope and trust that this book will serve as an introductory 
guide for the mathematical aspects of the important topics in the physical 
sciences and engineering. 


Sapporo, 
November 2009 


Hiroyuki Shima 
Tsuneyoshi Nakayama 
Preliminaries


This chapter provides the basic notation, terminology, and abbreviations that
we will be using, particularly in real analysis, and is designed to serve as a
reference rather than as a systematic exposition.


1.1 Basic Notions of a Set 

1.1.1 Set and Element 

A set is a collection of elements (or points) that are definite and separate
objects. If a is an element of a set S, we write

$a \in S$.


Otherwise, we write

$a \notin S$

to indicate that a does not belong to S. If a set contains no elements, it is
called an empty set and is designated by $\emptyset$.

A set may be defined by listing its elements or by providing a rule that
determines which elements belong to it. For example, we write


$X = \{x_1, x_2, x_3, \dots, x_n\}$

to indicate that X is a set with n elements: $x_1, x_2, \dots, x_n$. When a set
contains a finite (infinite) number of elements, it is called a finite (infinite)
set.

A set X is said to be a subset of Y if every element in X is also an element
in Y. This relationship is expressed as


$X \subseteq Y$.

When $X \subseteq Y$ and $Y \subseteq X$, the two sets have the same elements and are said
to be equal, which is expressed by


$X = Y$.

But when $X \subseteq Y$ and $X \ne Y$, then X is called a proper subset of Y, and
we use the more specific expression

$X \subset Y$.

The intersection of two sets X and Y, denoted by

$X \cap Y$,

consists of elements that are contained in both X and Y. The union

$X \cup Y$

consists of all the elements contained in either X or Y, including those
contained in both X and Y. When the two sets X and Y have no element in
common (i.e., when $X \cap Y = \emptyset$), X and Y are said to be disjoint.

For two sets A and B, we define their difference by the set

$\{x : x \in A,\ x \notin B\}$

and denote it by $A \setminus B$ (see Fig. 1.1). In particular, if A contains all the sets
under discussion, we say that A is the universal set, and $A \setminus B$ is called the
complementary set or complement of B.


Fig. 1.1. Left: The difference of two sets A and B. Right: The complementary set
or complement of B in A
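The set relations of this subsection can be tried out on small finite sets. The following is a minimal sketch using Python's built-in `set` type; the particular sets `A`, `B`, `C` and the helper `is_proper_subset` are illustrative assumptions, not part of the text.

```python
# Finite sets modelled with Python's built-in set type (illustrative values).
A = {1, 2, 3, 4, 5}
B = {4, 5, 6}
C = {1, 2}


def is_proper_subset(x, y):
    """True when every element of x lies in y and x differs from y."""
    return x <= y and x != y


intersection = A & B        # elements contained in both A and B
union = A | B               # elements contained in A or B (or both)
difference = A - B          # A \ B: elements of A that are not in B
disjoint = C.isdisjoint(B)  # True when C and B have no element in common

print(intersection, union, difference, disjoint, is_proper_subset(C, A))
```

Here `x <= y` is Python's subset test, so `is_proper_subset` mirrors the condition "X is a subset of Y and X differs from Y".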
1.1.2 Number Sets

Our abbreviations for fundamental number systems are given by 

N : The set of all positive integers not including zero. 

Z : The set of all integers. 

Q : The set of all rational numbers. 

R : The set of all real numbers. 

C : The set of all complex numbers. 

The symbol $R^n$ denotes an n-dimensional Euclidean space (see Sects.
4.1.3 and 19.2.3). Points in $R^n$ are denoted by boldface, say, x; the
coordinates of x are denoted by the ordered n-tuple $(x_1, x_2, \dots, x_n)$, where $x_i \in R$.
We also use the extended real number system defined by


$\bar{R} = R \cup \{-\infty, \infty\}$.


1.1.3 Bounds 

The precise terminology for bounds of sets of real numbers follows. Throughout, we
assume S to be a set of real numbers.


Bounds of a set:

1. A real number b such that $x \le b$ for all $x \in S$ is called an upper
bound of S.

2. A real number a such that $x \ge a$ for all $x \in S$ is called a lower
bound of S.


Figure 1.2 illustrates the point. We say that a set S is bounded above or
bounded below if it has an upper bound or a lower bound, respectively.
In particular, when a set S is bounded above and below simultaneously, it
is a bounded set. If a set S is not bounded, then it is said to be an
unbounded set.

It follows from these definitions that if b is an upper bound of S, any
number greater than b will also be an upper bound of S. Thus it makes sense
to seek the smallest among such upper bounds. This is also the case for a
lower bound of S if it is bounded below. In fact, the two extreme bounds, the
smallest upper bound and the largest lower bound, are referred to by specific
names as follows:

Least upper bound:

An element $b \in R$ is called the least upper bound (abbreviated by
l.u.b.) or supremum of S if

(i) b is an upper bound of S, and

(ii) there is no upper bound of S that is smaller than b.

Greatest lower bound:

An element $a \in R$ is called the greatest lower bound (abbreviated
by g.l.b.) or infimum of S if

(i) a is a lower bound of S, and

(ii) there is no lower bound of S that is greater than a.




Fig. 1.2. In all the figures, the points a and b are lower and upper bounds of S,
respectively. In particular, the point $a_G$ is the greatest lower bound, and the
point $b_L$ is the least upper bound


In symbols, the supremum and infimum of S are denoted, respectively, by

$\sup S$ and $\inf S$.

We must emphasize the fact that the supremum and infimum of the set S
may or may not belong to S. For instance, the set $S = \{x : x < 1\}$ has the
supremum 1, which does not belong to S. Nevertheless, particularly when
S is finite, we have


$\sup S = \max S$ and $\inf S = \min S$,

where max S and min S denote the maximum and minimum of S,
respectively, both of which belong to S.
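The contrast between an attained and a non-attained supremum can be made concrete. The sketch below is only an illustration under stated assumptions: it takes the points $x_n = 1 - 1/n$ as a sample of the set $\{x : x < 1\}$, and the function names are hypothetical. Exact rational arithmetic (`fractions.Fraction`) avoids rounding effects.

```python
from fractions import Fraction


def sup_inf_finite(s):
    """For a nonempty finite set, the supremum is max and the infimum is min."""
    return max(s), min(s)


def gap_to_supremum(n_terms):
    """Distances 1 - x_n for x_n = 1 - 1/n, n = 1, ..., n_terms.

    Every distance is positive (no x_n equals 1), yet the distances shrink
    toward 0, which is why 1 is the least upper bound without being attained.
    """
    one = Fraction(1)
    return [one - (one - Fraction(1, n)) for n in range(1, n_terms + 1)]


finite = {Fraction(3, 2), Fraction(-1), Fraction(7, 4)}
print(sup_inf_finite(finite))

gaps = gap_to_supremum(1000)
print(gaps[-1])  # 1/1000
```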


1.1.4 Interval 

When a set of real numbers is bounded above or below (or both) , it is referred 
to as an interval; there are several classes of intervals as listed below. 





Intervals: Given a real variable x, the set of all values of x such that

1. $a \le x \le b$ is a closed interval, denoted by $[a, b]$.

2. $a < x < b$ is a bounded open interval, denoted by $(a, b)$.

3. $a < x$ and $x < b$ are unbounded open intervals, denoted by $(a, \infty)$
and $(-\infty, b)$, respectively.

Sets of points {x} such that

$a \le x < b$, $a < x \le b$, $a \le x$, $x \le b$

may be referred to as semiclosed intervals; see Sect. 1.1.5 for more rigorous
definitions. Every interval $I_1$ contained in another interval $I_2$ is a subinterval
of $I_2$.


1.1.5 Neighborhood and Contact Point 

The following is a preliminary definition that will be significant in the
discussions on continuity and convergence properties of sets and functions.

Neighborhoods:

Let $x \in R$. A set $V \subseteq R$ is called a neighborhood of x if there is a
number $\varepsilon > 0$ such that

$(x - \varepsilon, x + \varepsilon) \subseteq V$.


In line with the idea of neighborhoods, we introduce the following important
concept (see Fig. 1.3):

Contact points:

Assume a point $x \in R$ and a set $S \subseteq R$. Then x is called a contact
point of S if and only if every neighborhood of x contains at least one
point of S.


Remark. A contact point of S may or may not belong to S. In contrast, a
point $x \in S$ is necessarily a contact point of S.

In particular, when S is a single-element set given by $S = \{x_0\}$ with
$x_0 \in R$, then $x_0$ is a contact point of S since every neighborhood of $x_0$
contains $x_0$ itself. The collection of all contact points of a set S is called
the closure of S and is denoted by [S].
Fig. 1.3. Case A: x is a limit point (and thus a contact point) of S. Case B: x is
not a contact point of S. Case C: x is an isolated point (and thus a contact point)
of S


Contact points can be classified as follows (see again Fig. 1.3):

Limit points:

A contact point $x \in R$ is called a limit point of the set $S \subseteq R$ if and
only if every neighborhood V of x contains a point of S different from x.

Isolated points:

A contact point x is called an isolated point of S if and only if x has
a neighborhood V in which x is the only point belonging to S.

In plain words, a limit point x is a point such that every interval
$(x - \varepsilon, x + \varepsilon)$ contains an infinite number of points of S, regardless of
the smallness of $\varepsilon$. A limit point may be referred to as a cluster point or
accumulation point, depending on the context. The symbol $S'$ is commonly used
to denote the set of limit points of S.

Examples 1. If S is the set of rational numbers in the interval [0, 1], then
every point of [0, 1], rational or not, is a limit point of S.

2. The integer set Z has no limit point; it has an infinite number of isolated
points.

3. The origin is the limit point of the set $\{1/m : m \in N\}$.
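Example 3 can be checked constructively: for any $\varepsilon > 0$ one can name a point of $\{1/m : m \in N\}$ inside $(0, \varepsilon)$, so every neighborhood of the origin meets the set in a point other than the origin. The helper below is a hypothetical sketch of that witness construction, using exact fractions.

```python
from fractions import Fraction


def witness_near_origin(eps):
    """Return a point 1/m of {1/m : m in N} with 0 < 1/m < eps.

    Choosing m = floor(1/eps) + 1 guarantees m > 1/eps, hence 1/m < eps.
    """
    m = int(1 / eps) + 1
    return Fraction(1, m)


for eps in (Fraction(1, 2), Fraction(3, 7), Fraction(1, 10), Fraction(1, 1000)):
    point = witness_near_origin(eps)
    print(eps, point, 0 < point < eps)
```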
Remark. From the definition, a limit point of a set need not belong to the set.
For instance, x = 1 is a limit point of the set $S = \{x : x \in R,\ x > 1\}$, but
it does not belong to S. In contrast, an isolated point of S must lie in S.

Limit points are further divided into two classes. A limit point x of a set S
is called an interior point of S if and only if x has a neighborhood $V \subseteq S$.
Otherwise, it is called a boundary point of S. Figure 1.4 is a schematic
illustration of the difference between interior and boundary points.

Fig. 1.4. All four points are limit points of S. Among them, b and c are interior
points, whereas a and d are boundary points


1.1.6 Closed and Open Sets 

Closed and open sets are defined in terms of the concepts of contact points
and closure. Recall that the closure of S, denoted by [S], is the set of all contact
points of S, which is the union of two sets: the limit points and the isolated
points of S.


Closed sets:

A set $S \subseteq R$ is closed if [S] = S, i.e., if S coincides with its own closure.

Open sets:

A set $S \subseteq R$ is open if S consists entirely of its interior points and has
no boundary points.


It follows intuitively that a set $S \subseteq R$ is open if and only if its
complementary set is closed. The proof is given in Exercise 4 in this chapter. Note that
the condition $[S] \ne S$ is inconclusive as to whether S is open or not.

Examples 1. Every single-element set $S = \{x_0\}$ with $x_0 \in R$ is closed since
[S] = S.

2. Every set consisting of a finite number of points is closed.

3. For any real number x, the set $R \setminus \{x\}$ is open since {x} is closed.

4. The intervals $[a, b]$, $[a, \infty)$, and $(-\infty, b]$ are all closed, which is proven by
considering their closures.

5. The interval $[a, b)$ is neither closed nor open. In fact, it is not closed since
it excludes its boundary point b, and it is not open since it contains its
boundary point a.





Exercises 

1. Give the supremum and infimum of each of the following sets:

(1) $S = \{x : 0 < x < 5\}$.

(2) $S = \{x : x \in Q \text{ and } x^2 < 2\}$.

(3) $S = \{x : x = 3 + 1/n,\ n \in N\}$.

Solution: (1) $\sup S = 5$, $\inf S = 0$. (2) $\sup S = \sqrt{2}$, $\inf S = -\sqrt{2}$.

(3) $\sup S = 4$, $\inf S = 3$.

2. Suppose S to be any of the intervals $(a, b)$, $[a, b)$, $(a, b]$, or $[a, b]$. Show that

$\sup S = b$, $\inf S = a$.


Solution: Take $S = (a, b)$. Since $x < b$ for all $x \in S$, b serves
as one of the upper bounds of S. We show that b is indeed the least
upper bound. To see this, we first assume that u is another upper
bound of S such that $u < b$; then $a < u < (u + b)/2 < b$. This
implies that

$\frac{u + b}{2} \in S$ and $u < \frac{u + b}{2}$,

which contradicts the assumption that u is an upper bound of S.
Hence, $u \ge b$; i.e., any upper bound other than b must be larger
than b. We thus conclude that $b = \sup S$. The proof is similar for
the other three cases.


3. Show that the set of integers has no limit point, i.e., the set $Z'$ of limit
points of Z satisfies $Z' = \emptyset$.

Solution: Take any $x \in R$, and let $\varepsilon = \min\{|n - x| : n \in Z,\ n \ne x\}$.
The interval $(x - \varepsilon, x + \varepsilon)$ contains no integers other than possibly
x itself; hence, $x \notin Z'$. In particular, since this is the case for any
$x \in Z$, we conclude that Z is totally composed of isolated points.


4. Show that a set $S \subseteq R$ is open if and only if its complementary set $R \setminus S$
is closed.

Solution: If S is open, then every point $x \in S$ has a neighborhood
contained in S. Therefore no point $x \in S$ can be a contact point of
$R \setminus S$. In other words, if x is a contact point of $R \setminus S$, then $x \in R \setminus S$,
i.e., $R \setminus S$ is closed.

Conversely, if $R \setminus S$ is closed, then any point $x \in S$ must have a
neighborhood contained in S, since otherwise every neighborhood
of x would contain points of $R \setminus S$, i.e., x would be a contact point
of $R \setminus S$ not in $R \setminus S$. Therefore S is open.


1.2 Conditional Statements 

Phrases such as if... then..., and ... if and only if ... are frequently used to 
connect simple statements that can be described as either true or false. For 
the sake of typographical convenience, there are conventional logical symbols 
for representing such phrases. 

Suppose P and Q are two different statements. The compound statements 

if P then Q 


and 


P implies Q 


mean that if P is true then Q is true. This is written symbolically as


$P \Rightarrow Q$.   (1.1)


We say that 


P is a sufficient condition for Q 


or 


Q is a necessary condition for P. 

In the above context, P stands for the hypothesis or assumption, and Q is the 
conclusion. 


Remark. To prove the implication (1.1) in actual problems, it suffices to ex- 
clude the possibility that P is true and Q is false. This may be done in one 
of three ways. 

1. Assume that P is true and prove that Q is true (direct proof). 

2. Assume that Q is false and prove that P is false (contrapositive proof). 

3. Assume that P is true and Q is false, and then prove that this leads to a 
contradiction (proof by contradiction). 


When P implies Q and Q implies P, we abbreviate this to


$P \Leftrightarrow Q$,

and we say that

P is equivalent to Q

or, more commonly,

P if and only if Q.

This also means that P is a necessary and sufficient condition for Q.
Examples Observe that

$x = 1 \Rightarrow x^2 = 1$ and $x = -1 \Rightarrow x^2 = 1$.

Conversely, we see that

$x^2 = 1 \Rightarrow x = -1$ or $1$.

Therefore, we conclude that

$x^2 = 1 \Leftrightarrow x \in \{-1, 1\}$.
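The logic of this section can be tabulated mechanically. The sketch below, with hypothetical helper names `implies` and `iff`, encodes "P implies Q" as false only when P is true and Q is false, confirms on every truth assignment that an implication agrees with its contrapositive, and checks the example $x^2 = 1 \Leftrightarrow x \in \{-1, 1\}$ on a small range of integers (an assumed test range, not a proof).

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """Equivalence: p implies q and q implies p."""
    return implies(p, q) and implies(q, p)


# Truth table; the contrapositive (not Q => not P) matches P => Q in every row.
for p, q in product([True, False], repeat=2):
    assert implies(p, q) == implies(not q, not p)
    print(p, q, implies(p, q), iff(p, q))

# x^2 = 1 holds exactly for x in {-1, 1} (checked on integers -3..3 only).
solutions = [x for x in range(-3, 4) if x * x == 1]
print(solutions)
```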


1.3 Order of Magnitude 

1.3.1 Symbols O, o, and ~ 

We use the notations O, o, and ~ to express orders of magnitude. To explain
their use, we consider the behavior of functions f(x) and g(x) in a
neighborhood of a point $x_0$.

1. We write

$f(x) = O(g(x))$, $x \to x_0$

if there exists a positive constant A such that

$|f(x)| \le A|g(x)|$

for all values of x in some neighborhood of $x_0$.

2. We write

$f(x) = o(g(x))$, $x \to x_0$

if

$\lim_{x \to x_0} \frac{f(x)}{g(x)} = 0$.

3. We write

$f(x) \sim g(x)$, $x \to x_0$

if

$\lim_{x \to x_0} \frac{f(x)}{g(x)} = 1$.
In addition to the formal definitions above, we summarize the actual
meaning of these symbols:

1. $f(x) = O(g(x))$ means that f(x) does not grow faster than g(x) as $x \to x_0$.

2. $f(x) = o(g(x))$ means that f(x) grows more slowly than g(x) as $x \to x_0$.

3. $f(x) \sim g(x)$ means that f(x) and g(x) grow at the same rate as $x \to x_0$.

We occasionally employ the symbol

$f(x) = O(1)$ as $x \to x_0$.

This simply means that f(x) is bounded (of order 1) near $x_0$. The symbol

$f(x) = o(1)$ as $x \to x_0$

means that f(x) approaches zero as $x \to x_0$.

Examples The relations 1-3 below hold for $x \to \infty$.

1. $\frac{1}{1 + x^2} = O\left(\frac{1}{x^2}\right)$.

2. $\frac{1}{1 + x^2} \sim \frac{1}{x^2}$.

3. $\sqrt{x^2 + 1} = x + O\left(\frac{1}{x}\right)$, $\sqrt{x^2 + 1} = x + o(1)$, $\sqrt{x^2 + 1} \sim x$.

The following hold for $x \to 0$:

4. $\sin x = O(1)$, $\sin x \sim x$, $\cos x = 1 + O(x^2)$.
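A quick numerical look can make the order symbols tangible for $\sqrt{x^2 + 1}$ as $x \to \infty$. The sketch below is an illustration only, with hypothetical function names and an arbitrary choice of sample points: if the remainder $\sqrt{x^2 + 1} - x$ is $O(1/x)$, then $x$ times the remainder must stay bounded, and if $\sqrt{x^2 + 1} \sim x$, the ratio must approach 1.

```python
import math


def scaled_remainder(x):
    """x * (sqrt(x^2 + 1) - x); bounded in x when the remainder is O(1/x)."""
    return x * (math.sqrt(x * x + 1.0) - x)


def ratio(x):
    """sqrt(x^2 + 1) / x; tends to 1 when sqrt(x^2 + 1) ~ x."""
    return math.sqrt(x * x + 1.0) / x


for x in (10.0, 100.0, 1000.0):
    print(x, scaled_remainder(x), ratio(x))
```

Numerical evidence of this kind does not replace the limit definitions; it only shows the behavior the symbols describe.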


1.3.2 Asymptotic Behavior 

Asymptotic behavior of f(x) as $x \to a$ can be quantified by using the powers
of $(x - a)$ as comparison functions. As an example, suppose that a function
f(x) satisfies the relation

$f(x) = O((x - a)^p)$ for $x \to a$   (1.2)

for some real number $p = p_0$. Then, the relation (1.2) clearly holds for all
$p < p_0$, and it may or may not hold for $p > p_0$. Thus we can
define the supremum of such p's that satisfy (1.2), and denote it by q, i.e.,

$q = \sup\{p \mid f(x) = O((x - a)^p)\}$.   (1.3)

In this case, we say that f vanishes at x = a to order q. The quantity q
defined by (1.3) is useful for describing the asymptotic behavior of f(x) in the
vicinity of x = a.
Remark. Note that (1.3) itself does not imply that

$f(x) = O((x - a)^q)$, $x \to a$.

For instance, the function $f(x) = \log x$ defined within the interval (0, 1) yields
q = 0, since for $x \to 0$,

$\log x \begin{cases} = O(x^p), & p < 0, \\ \ne O(x^p), & p > 0. \end{cases}$

But it is obvious that $\log x \ne O(1)$.


1.4 Values of Indeterminate Forms 


1.4.1 l’Hopital’s Rule 


A function f(x) of the form u(x)/v(x) is not defined for x = a if f(a) takes
the form 0/0. Still, if the limit $\lim_{x \to a} f(x)$ exists, then it is often desirable
to define $f(a) = \lim_{x \to a} f(x)$. In such a case, the value of the limit can be
evaluated by using the following theorem:


l’Hopital’s rule:

Let u(a) = v(a) = 0. If there exists a neighborhood of x = a such that

(i) $v(x) \ne 0$ except for x = a, and

(ii) u'(x) and v'(x) exist and do not vanish simultaneously, then

$\lim_{x \to a} \frac{u(x)}{v(x)} = \lim_{x \to a} \frac{u'(x)}{v'(x)}$

whenever the limit on the right exists.


For the proof of the theorem, see Exercise 3 in Sect. 8.1.


Remark. If u'(x)/v'(x) is itself an indeterminate form, the above method may
be applied to u'(x)/v'(x) in turn, so that

$\lim_{x \to a} \frac{u(x)}{v(x)} = \lim_{x \to a} \frac{u'(x)}{v'(x)} = \lim_{x \to a} \frac{u''(x)}{v''(x)}$.

If necessary, this process may be continued.
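As a worked illustration of the repeated application (the choice of functions is ours, not the text's), take $u(x) = 1 - \cos x$ and $v(x) = x^2$ at $a = 0$. Both the original quotient and the quotient of first derivatives have the form 0/0, so the rule is applied twice:

```latex
\lim_{x \to 0} \frac{1 - \cos x}{x^2}
  = \lim_{x \to 0} \frac{\sin x}{2x}
  = \lim_{x \to 0} \frac{\cos x}{2}
  = \frac{1}{2}.
```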
1.4.2 Several Examples


In the following, we show several examples of indeterminate forms other
than the form 0/0 previously discussed. Often functions f(x) of the forms
$u(x)v(x)$, $[u(x)]^{v(x)}$, and $u(x) - v(x)$ can be reduced to the form p(x)/q(x)
with the aid of the following relations:

$u(x)v(x) = \frac{u(x)}{1/v(x)} = \frac{v(x)}{1/u(x)}$,

$[u(x)]^{v(x)} = e^{g(x)}$, where $g(x) = \frac{\log u(x)}{1/v(x)} = \frac{v(x)}{1/\log u(x)}$,

$u(x) - v(x) = \frac{1/v(x) - 1/u(x)}{1/[u(x)v(x)]} = \log h(x)$, where $h(x) = \frac{e^{u(x)}}{e^{v(x)}}$.

After the reduction, the l’Hopital method given in Sect. 1.4.1 becomes
applicable.
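As a worked illustration of reducing a difference to a quotient (the functions are chosen here for illustration and do not come from the text), take $u(x) = 1/x$ and $v(x) = 1/\sin x$ as $x \to 0$. Writing $u - v = (1/v - 1/u)/(1/(uv))$ gives a 0/0 form, to which l’Hopital’s rule is applied twice:

```latex
\lim_{x \to 0} \left( \frac{1}{x} - \frac{1}{\sin x} \right)
  = \lim_{x \to 0} \frac{\sin x - x}{x \sin x}
  = \lim_{x \to 0} \frac{\cos x - 1}{\sin x + x \cos x}
  = \lim_{x \to 0} \frac{-\sin x}{2\cos x - x \sin x}
  = 0.
```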



Part I 


Real Analysis 
Real Sequences and Series


Abstract In this chapter, we deal with the fundamental properties of sequences and
series of real numbers. We place particular emphasis on the concept of “convergence,”
a thorough understanding of which is important for the study of the various branches
of mathematical physics that we are concerned with in subsequent chapters.


2.1 Sequences of Real Numbers 


2.1.1 Convergence of a Sequence 


This section describes the fundamental definitions and ideas associated with
sequences of real numbers (called real sequences). We must emphasize
that the sequence

$(x_n : n \in N)$


is not the same as the set

$\{x_n : n \in N\}$.

In fact, the former is the ordered list of $x_n$, some of which may be repeated,
whereas the latter is merely the defining range of $x_n$. For instance, the constant
sequence $x_n = 1$ is denoted by $(1, 1, 1, \dots)$, whereas the set {1} contains only
one element.

We start with a precise definition of the convergence of a real sequence, 
which is an initial and crucial step for various branches of mathematics. 


Convergence of a real sequence:

A real sequence $(x_n)$ is said to be convergent if there exists a real
number x with the following property: For every $\varepsilon > 0$, there is an integer
N such that

$n \ge N \Rightarrow |x_n - x| < \varepsilon$.   (2.1)

We must emphasize that the magnitude of $\varepsilon$ is arbitrary. No matter how
small an $\varepsilon$ we choose, it must always be possible to find a number N, which
in general will increase as $\varepsilon$ decreases.

Remark. In the language of neighborhoods, the above definition is stated as
follows: The sequence $(x_n)$ converges to x if every neighborhood of x contains
all but a finite number of elements of the sequence.


When $(x_n)$ is convergent, the number x specified in this definition is called
a limit of the sequence $(x_n)$, and we say that $x_n$ converges to x. This is
expressed symbolically by writing


$\lim_{n \to \infty} x_n = x$,


or simply by

$x_n \to x$.


If $(x_n)$ is not convergent, it is called divergent.


Remark. The limit x may or may not belong to $(x_n)$; this situation is similar
to the case of the limit point of a set of real numbers discussed in Sect. 1.1.5.


An example in which $x = \lim x_n$ but $x \ne x_n$ for any n is given below.


Examples Suppose that a sequence $(x_n)$ consisting of rational numbers is
defined by

$(x_n) = (3.1, 3.14, 3.142, \dots, x_n, \dots)$,


where $x_n \in Q$ is a rational number to n decimal places close to $\pi$. Since the
difference $|x_n - \pi|$ is less than $10^{-n}$, it is possible to find an N for any
$\varepsilon > 0$ such that

$n \ge N \Rightarrow |x_n - \pi| < \varepsilon$.


This means that


$\lim_{n \to \infty} x_n = \pi$.


However, as the limit, $\pi$, is an irrational number, it is not in Q.
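The search for N in this example can be sketched numerically. The code below is an illustration under explicit assumptions: `math.pi` (a double-precision value) stands in for $\pi$, the terms are produced with Python's `round`, and only $n \le 10$ is used so that floating-point error stays far below $10^{-n}$. The function name `first_index_within` is hypothetical.

```python
import math


def x(n):
    """pi rounded to n decimal places: 3.1, 3.14, 3.142, ..."""
    return round(math.pi, n)


def first_index_within(eps, max_n=10):
    """Smallest n <= max_n with |x_n - pi| < eps, or None if there is none."""
    for n in range(1, max_n + 1):
        if abs(x(n) - math.pi) < eps:
            return n
    return None


print([x(n) for n in range(1, 5)])
print(first_index_within(1e-3))
```

Because $|x_n - \pi| < 10^{-n}$, any N with $10^{-N} \le \varepsilon$ works in the definition; the function simply reports the first index at which the inequality is observed.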


Remark. The above example indicates that only a restricted class of convergent
sequences has a limit in the same sequence.


2.1.2 Bounded Sequences 

In the remainder of this section, we present several fundamental concepts 
associated with real sequences. We start with the boundedness properties of 
sequences. 
Bounded sequences:

A real sequence $(x_n)$ is said to be bounded if there is a positive number
M such that

$|x_n| \le M$ for all $n \in N$.


The following is an important relation between convergence and boundedness
of a real sequence:

Theorem:

If a sequence is convergent, then it is bounded.


Proof Suppose that $x_n \to x$. If we choose $\varepsilon = 1$ in (2.1), there exists an integer
N such that

$|x_n - x| < 1$ for all $n \ge N$.

Since $|x_n| - |x| \le |x_n - x|$, it follows that

$|x_n| < 1 + |x|$ for all $n \ge N$.

Setting $M = \max\{|x_1|, |x_2|, \dots, |x_{N-1}|, 1 + |x|\}$ yields

$|x_n| \le M$ for all $n \in N$,

which means that $(x_n)$ is bounded.

Remark. Observe that the converse of the theorem is false. In fact, the sequence

$(1, -1, 1, -1, \dots, (-1)^{n-1}, \dots)$

is divergent, although it is bounded.


2.1.3 Monotonic Sequences 

Another important concept in connection with real sequences is monotonicity, 
defined as follows: 

Monotonic sequences:

A sequence $(x_n)$ is said to be

1. increasing (or monotonically increasing) if $x_{n+1} \ge x_n$ for all $n \in N$,

2. strictly increasing if $x_{n+1} > x_n$ for all $n \in N$,

3. decreasing (or monotonically decreasing) if $x_{n+1} \le x_n$ for all $n \in N$,
and

4. strictly decreasing if $x_{n+1} < x_n$ for all $n \in N$.


These four kinds of sequences are collectively known as monotonic
sequences. Note that a sequence $(x_n)$ is increasing if and only if $(-x_n)$ is
decreasing. Thus, the properties of monotonic sequences can be fully
investigated by restricting ourselves solely to increasing (or decreasing) sequences.

Once a sequence is known to be monotonic, its convergence is
determined only by its boundedness, as stated below.

Theorem:

A monotonic sequence is convergent if and only if it is bounded. More
specifically,

(i) If $(x_n)$ is increasing and bounded above, then its limit is given by

$\lim_{n \to \infty} x_n = \sup x_n$.

(ii) If $(x_n)$ is decreasing and bounded below, then

$\lim_{n \to \infty} x_n = \inf x_n$.

Proof If $(x_n)$ is convergent, then it must be bounded as proven earlier (see
Sect. 2.1.2). Now we consider the converses for cases (i) and (ii).

(i) Assume $(x_n)$ is increasing and bounded. The set $S = \{x_n\}$ will then have
the supremum denoted by $\sup S = x$. By the definition of the supremum,
for arbitrarily small $\varepsilon > 0$ there is an $x_N \in S$ such that

$x_N > x - \varepsilon$.   (2.2)

Since $(x_n)$ is increasing, we obtain

$x_n \ge x_N$ for all $n \ge N$.   (2.3)

Moreover, since x is the supremum of S, we have

$x \ge x_n$ for all $n \in N$.   (2.4)

From (2.2), (2.3), and (2.4), we arrive at

$|x_n - x| = x - x_n \le x - x_N < \varepsilon$ for all $n \ge N$,

which gives us the desired conclusion, i.e.,

$\lim_{n \to \infty} x_n = x = \sup S$.

(ii) If $(x_n)$ is decreasing and bounded, then $(-x_n)$ is increasing and bounded.
Hence, from (i), we have

$\lim_{n \to \infty} (-x_n) = \sup(-S)$.

Since $\sup(-S) = -\inf S$, it follows that

$\lim_{n \to \infty} x_n = \inf S$.
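The theorem can be watched at work on a concrete increasing, bounded sequence. The example below is our own choice, not the text's: $x_1 = \sqrt{2}$, $x_{n+1} = \sqrt{2 + x_n}$, which is increasing and bounded above by 2, so it must converge to its supremum (here 2). The code is a floating-point sketch with a hypothetical function name.

```python
import math


def nested_radicals(count):
    """First `count` terms of x_1 = sqrt(2), x_{n+1} = sqrt(2 + x_n)."""
    xs = [math.sqrt(2.0)]
    while len(xs) < count:
        xs.append(math.sqrt(2.0 + xs[-1]))
    return xs


xs = nested_radicals(12)
increasing = all(b >= a for a, b in zip(xs, xs[1:]))  # monotonicity check
bounded_above = all(t <= 2.0 for t in xs)             # 2 is an upper bound
print(increasing, bounded_above, xs[-1])
```

The last computed term plays the role of $x_N$ in (2.2): it already lies within a small $\varepsilon$ of the supremum, and all later terms are trapped between it and the supremum.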


2.1.4 Limit Superior and Limit Inferior 

We close this section by introducing two specific limits of any bounded
sequence. Let $(x_n)$ be a bounded sequence and define two sequences $(y_n)$
and $(z_n)$ as follows:

$y_n = \sup\{x_k : k \ge n\}$,   $z_n = \inf\{x_k : k \ge n\}$.   (2.5)

Note that $y_n$ and $z_n$ differ, respectively, from $\sup\{x_n\}$ and $\inf\{x_n\}$. It follows
from (2.5) that

$y_1 = \sup\{x_k : k \ge 1\} \ge y_2 = \sup\{x_k : k \ge 2\} \ge y_3 \ge \cdots$,

which means that the sequence $(y_n)$ is monotonically decreasing and bounded
below by $\inf x_n$. Thus, in view of the theorem in Sect. 2.1.3, the sequence $(y_n)$
must be convergent. The limit of $(y_n)$ is called the limit superior or the
upper limit of $(x_n)$ and is denoted by

$\limsup_{n \to \infty} x_n$ (or $\overline{\lim}\, x_n$).

Likewise, since $(z_n)$ is increasing and bounded above by $\sup x_n$, it possesses
the limit known as the limit inferior or lower limit of $(x_n)$, denoted by

$\liminf_{n \to \infty} x_n$ (or $\underline{\lim}\, x_n$).

In terms of the two specific limits, we can say that a bounded sequence $(x_n)$
converges if and only if

$\lim_{n \to \infty} x_n = \limsup_{n \to \infty} x_n = \liminf_{n \to \infty} x_n$.

(A proof will be given in Exercise 4 in Sect. 2.1.4.) Note that by definition, it
readily follows that

$\limsup_{n \to \infty} x_n \ge \liminf_{n \to \infty} x_n$,

$\limsup_{n \to \infty} (-x_n) = -\liminf_{n \to \infty} x_n$.
Examples 1. $x_n = (-1)^n \Rightarrow \limsup_{n \to \infty} x_n = 1$, $\liminf_{n \to \infty} x_n = -1$.

2. $x_n = (-1)^n + \frac{1}{n} \Rightarrow \limsup_{n \to \infty} x_n = 1$, $\liminf_{n \to \infty} x_n = -1$.

3. $x_{2n} = 1 + \frac{(-1)^n}{n}$, $x_{2n-1} = \frac{(-1)^n}{n} \Rightarrow \limsup_{n \to \infty} x_n = 1$, $\liminf_{n \to \infty} x_n = 0$.

4. $(x_n) = (2, 0, -2, 2, 0, -2, \dots) \Rightarrow \limsup_{n \to \infty} x_n = 2$, $\liminf_{n \to \infty} x_n = -2$.

The four cases noted above are illustrated schematically in Fig. 2.1. None of
these sequences $(x_n)$ converges, and thus the limit $\lim_{n \to \infty} x_n$ does
not exist. This fact clarifies the crucial difference between $\lim_{n \to \infty} x_n$ and
$\limsup_{n \to \infty} x_n$ (or $\liminf_{n \to \infty} x_n$).
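The tail suprema and infima behind these limits can be approximated on a computer by cutting each tail off at a finite horizon. The sketch below does this for $x_k = (-1)^k + 1/k$; the horizon, the function name `tail_sup_inf`, and the truncation itself are assumptions of the illustration, since the true $y_n$ and $z_n$ range over infinitely many terms.

```python
def x(k):
    """Terms of the sequence x_k = (-1)^k + 1/k, k = 1, 2, ..."""
    return (-1) ** k + 1.0 / k


def tail_sup_inf(seq, n, horizon):
    """Max and min of {seq(k) : n <= k <= horizon}.

    A finite stand-in for y_n = sup{x_k : k >= n} and z_n = inf{x_k : k >= n}.
    """
    tail = [seq(k) for k in range(n, horizon + 1)]
    return max(tail), min(tail)


for n in (1, 10, 100):
    print(n, tail_sup_inf(x, n, 10001))
```

As n grows, the first entry decreases toward 1 and the second stays close to -1, in line with a limit superior of 1 and a limit inferior of -1 even though the sequence itself has no limit.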


Fig. 2.1. None of the sequences $(x_n)$ in the figures converges, but they all
possess $\limsup_{n \to \infty} x_n = 1$ and $\liminf_{n \to \infty} x_n = -1$

The limit superior of $(x_n)$ has the following features, and similar features are
found for the limit inferior.

Theorem:

1. For any small $\varepsilon > 0$, we can find an N such that

$n > N \Rightarrow x_n < \limsup_{n \to \infty} x_n + \varepsilon$.

2. For any small $\varepsilon > 0$, there are an infinite number of terms of $(x_n)$ such
that

$\limsup_{n \to \infty} x_n - \varepsilon < x_n$.


Proof 1. Recall that $\limsup_{n \to \infty} x_n = \lim_{n \to \infty} y_n$, where $y_n$ is defined in (2.5). For
any $\varepsilon > 0$, there is an integer N such that

$n > N \Rightarrow \limsup_{n \to \infty} x_n - \varepsilon < y_n < \limsup_{n \to \infty} x_n + \varepsilon$.

Since $y_n \ge x_n$ for all n, we have

$n > N \Rightarrow x_n < \limsup_{n \to \infty} x_n + \varepsilon$.

2. Suppose, to the contrary, that there is an integer m such that

$n > m \Rightarrow \limsup_{n \to \infty} x_n - \varepsilon \ge x_n$.

Then for all $k \ge n > m$, we have

$x_k \le \limsup_{n \to \infty} x_n - \varepsilon$,

which means that

$y_n \le \limsup_{n \to \infty} x_n - \varepsilon$ for all $n > m$.

In the limit $n \to \infty$, we find the contradiction

$\limsup_{n \to \infty} x_n \le \limsup_{n \to \infty} x_n - \varepsilon$.

This completes the proof.

Exercises 

1. Prove that if the sequence $(x_n)$ is convergent, then its limit is unique.

Solution: Let $x = \lim x_n$ and $y = \lim x_n$ with the assumption
$x \ne y$. Then we can find a neighborhood $V_1$ of $x$ and a
neighborhood $V_2$ of $y$ such that $V_1 \cap V_2 = \emptyset$. For example, take
$V_1 = (x - \varepsilon, x + \varepsilon)$ and $V_2 = (y - \varepsilon, y + \varepsilon)$, where $\varepsilon = |x - y|/2$.
Since $x_n \to x$, all but a finite number of terms of the sequence
lie in $V_1$. Similarly, since $x_n \to y$, all but a finite number of its
terms also lie in $V_2$. However, these results contradict the fact that
$V_1 \cap V_2 = \emptyset$, which means that the limit of a sequence must be
unique.

2. If $x_n \to x \ne 0$, then there is a positive number $A$ and an integer $N$ such
that $n > N \Rightarrow |x_n| > A$. Prove it.

Solution: Let $\varepsilon = |x|/2$, which is a positive number. Hence, there
is an integer $N$ such that $n > N \Rightarrow |x_n - x| < \varepsilon \Rightarrow \big| |x_n| - |x| \big| < \varepsilon$.
Consequently, $|x| - \varepsilon < |x_n| < |x| + \varepsilon$ for all $n > N$. From the
left-hand inequality, we see that $|x_n| > |x|/2$, and we can take
$A = |x|/2$ to complete the proof.


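3. Prove that the sequence $x_n = [1 + (1/n)]^n$ is convergent.

Solution: The proof is completed by observing that the sequence
is monotonically increasing and bounded. To see this, we use the
binomial theorem, which gives

\[
x_n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k}
= 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right).
\]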


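Likewise we have

\[
\begin{aligned}
x_{n+1} = {} & 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n+1}\right) + \frac{1}{3!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) + \cdots \\
& + \frac{1}{n!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right)\cdots\left(1 - \frac{n-1}{n+1}\right) \\
& + \frac{1}{(n+1)!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right)\cdots\left(1 - \frac{n}{n+1}\right).
\end{aligned}
\]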


Comparing these expressions for $x_n$ and $x_{n+1}$, we see that every
term in $x_n$ is no more than the corresponding term in $x_{n+1}$. In
addition, $x_{n+1}$ has an extra positive term. We thus conclude that
$x_{n+1} > x_n$ for all $n \in \mathbb{N}$, which means that the sequence $(x_n)$ is
monotonically increasing.

We next prove boundedness. For every $n \in \mathbb{N}$, we have
$x_n \le \sum_{k=0}^{n} (1/k!)$. Using the inequality $2^{n-1} \le n!$ for $n \ge 1$ (which can
be easily seen by taking the logarithm of both sides), we obtain

\[ x_n \le 1 + \sum_{k=1}^{n} \frac{1}{2^{k-1}} = 1 + \frac{1 - (1/2)^n}{1 - (1/2)} < 3. \]

Thus $(x_n)$ is bounded above by 3. Hence, in view of the theorem in
Sect. 2.1.3, the sequence is convergent.
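The two properties used in this solution can be checked numerically for the first few hundred terms. This is only a floating-point sketch under the assumption that rounding errors stay far below the gaps between consecutive terms; the helper names `x` and `factorial_bound` are introduced here for illustration.

```python
def x(n):
    """The n-th term of the sequence (1 + 1/n)^n."""
    return (1 + 1 / n) ** n


def factorial_bound(n):
    """Partial sum 1/0! + 1/1! + ... + 1/n!, the bound used in the solution."""
    total, term = 0.0, 1.0
    for k in range(n + 1):
        if k > 0:
            term /= k          # term now equals 1/k!
        total += term
    return total


terms = [x(n) for n in range(1, 201)]

# Monotonicity: each term is strictly smaller than the next one.
increasing = all(a < b for a, b in zip(terms, terms[1:]))

# Boundedness: x_n <= sum of 1/k! < 3 (small tolerance for rounding).
bounded = all(
    x(n) <= factorial_bound(n) + 1e-12 < 3 for n in range(1, 201)
)

print("increasing:", increasing, "bounded:", bounded)
```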

4. Denote $\overline{x} = \limsup x_n$ and $\underline{x} = \liminf x_n$. Prove that a sequence $(x_n)$
converges to $x$ if and only if $\overline{x} = \underline{x} = x$.

Solution: In view of the theorem in Sect. 2.1.4, it follows that
$(-\infty, \overline{x} + \varepsilon)$ contains all but a finite number of terms of $(x_n)$. The
same property applied to $(-x_n)$ implies that $(\underline{x} - \varepsilon, \infty)$ contains
all but a finite number of such terms. If $\overline{x} = \underline{x} = x$, then
$(x - \varepsilon, x + \varepsilon)$ contains all but a finite number of terms of $(x_n)$. This is
the assertion that $x_n \to x$.

Now suppose that $x_n \to x$. For any $\varepsilon > 0$, there is an integer $N$
such that $n > N \Rightarrow x_n < x + \varepsilon \Rightarrow y_n \le x + \varepsilon$, where $y_n = \sup\{x_k : k \ge n\}$,
as was introduced in (2.5). Hence, $\overline{x} \le x + \varepsilon$. Since $\varepsilon > 0$
is arbitrary, we obtain $\overline{x} \le x$. Working with the sequence $(-x_n)$,
whose limit is $-x$, and following the same procedure, we get $\underline{x} \ge x$. Since
$\underline{x} \le \overline{x}$, we conclude that $\underline{x} = \overline{x} = x$.


2.2 Cauchy Criterion for Real Sequences 

2.2.1 Cauchy Sequence 

To test the convergence of a general (nonmonotonic) real sequence, we have 
thus far only the original definition given in Sect. 2.1.1 to rely on; in that 
case we must first have a candidate for the limit of the sequence in question 
before we can examine its convergence. Needless to say, it is more convenient 
if we can determine the convergence property of a sequence without having to 
guess its limit. This is achieved by applying the so-called Cauchy criterion, 
which plays a central role in developing the fundamentals of real analysis. 

To begin with, we present a preliminary notion for subsequent discussions. 

Cauchy sequence:

The sequence $(x_n)$ is called a Cauchy sequence (or fundamental
sequence) if for every positive number $\varepsilon$, there is a positive integer $N$
such that

\[ m, n > N \;\Rightarrow\; |x_n - x_m| < \varepsilon. \tag{2.6} \]

This means that in every Cauchy sequence, the terms can be as close to one 
another as we like. This feature of Cauchy sequences is expected to hold for 
any convergent sequence, since the terms of a convergent sequence have to 
approach each other as they approach a common limit. This conjecture is 
ensured in part by the following theorem. 
Theorem:

If a sequence $(x_n)$ is convergent, then it is a Cauchy sequence.


Proof Suppose $\lim x_n = x$ and $\varepsilon$ is any positive number. By hypothesis,
there exists a positive integer $N$ such that

\[ n > N \;\Rightarrow\; |x_n - x| < \frac{\varepsilon}{2}. \]

Now if we take $m, n > N$, then

\[ |x_n - x| < \frac{\varepsilon}{2} \quad \text{and} \quad |x_m - x| < \frac{\varepsilon}{2}. \]

It thus follows that

\[ |x_n - x_m| \le |x_m - x| + |x_n - x| < \varepsilon, \]

which means that $(x_n)$ is a Cauchy sequence.

This theorem naturally gives rise to the question of whether the converse
is true. In other words, we would like to know whether all Cauchy sequences are
convergent or not. The answer is exactly what the Cauchy criterion states, as
we prove in the next subsection.

2.2.2 Cauchy Criterion 

The following is one of the fundamental theorems of real sequences. 

Cauchy criterion:

A sequence of real numbers is convergent if and only if it is a Cauchy
sequence.


Bear in mind that the validity of this criterion was partly proven by demon- 
strating the previous theorem (see Sect. 2.2.1). Hence, in order to complete 
the proof of the criterion, we need only prove that every Cauchy sequence is 
convergent. The following serves as a lemma for developing the proof. 

Bolzano-Weierstrass theorem:

Every infinite and bounded sequence of real numbers has at least one
limit point in $\mathbb{R}$. (The proof is given in Appendix A.)


We are now ready to prove that every Cauchy sequence is convergent. 
Proof (of the Cauchy criterion): Let $(x_n)$ be a Cauchy sequence
and $S = \{x_n : n \in \mathbb{N}\}$. We consider two cases in turn: (i) the set $S$
is finite, and (ii) $S$ is infinite.

(i) It follows from the hypothesis that given $\varepsilon > 0$, there is an integer
$N$ such that

\[ m, n > N \;\Rightarrow\; |x_n - x_m| < \varepsilon. \tag{2.7} \]

Since $S$ is finite, at least one of its elements, say $x$, must be
repeated infinitely often in the sequence $(x_n)$. This
implies the existence of an $m > N$ such that $x_m = x$. Hence, by (2.7), we
have

\[ n > N \;\Rightarrow\; |x_n - x| < \varepsilon, \]

which means that $x_n \to x$.

(ii) Next we consider the case that $S$ is infinite. It can be shown that
every Cauchy sequence is bounded (see Exercise 1). Hence, in
view of the Bolzano-Weierstrass theorem, the sequence $(x_n)$
necessarily has a limit point $x$. We shall prove that $x_n \to x$. Given
$\varepsilon > 0$, there is an integer $N$ such that

\[ m, n > N \;\Rightarrow\; |x_n - x_m| < \varepsilon. \]

From the definition of a limit point, we see that the interval
$(x - \varepsilon, x + \varepsilon)$ contains an infinite number of terms of the sequence $(x_n)$.
Hence, there is an $m > N$ such that $x_m \in (x - \varepsilon, x + \varepsilon)$, i.e., such
that $|x_m - x| < \varepsilon$. Now, if $n > N$, then

\[ |x_n - x| \le |x_n - x_m| + |x_m - x| < \varepsilon + \varepsilon = 2\varepsilon, \]

which proves $x_n \to x$.

The results for (i) and (ii) shown above indicate that every Cauchy
sequence (whether its set of values is finite or infinite) is convergent.
Recall again that its converse, every convergent sequence is a Cauchy
sequence, was proven earlier in Sect. 2.2.1. This completes the proof of
the Cauchy criterion.


Exercises 

1. Show that every Cauchy sequence is bounded.

Solution: Let $(x_n)$ be a Cauchy sequence. Taking $\varepsilon = 1$, there is
an integer $N$ such that

\[ n > N \;\Rightarrow\; |x_n - x_N| < 1. \]

Since $|x_n| - |x_N| \le |x_n - x_N|$, we have

\[ n > N \;\Rightarrow\; |x_n| < |x_N| + 1. \]

Thus $|x_n|$ is bounded by $\max\{|x_1|, |x_2|, \cdots, |x_{N-1}|, |x_N| + 1\}$.


2. Let $x_1 = 1$, $x_2 = 2$, and $x_n = (x_{n-1} + x_{n-2})/2$ for all $n \ge 3$. Show that
$(x_n)$ is a Cauchy sequence.

Solution: Since for $n \ge 3$, $x_n - x_{n-1} = -(x_{n-1} - x_{n-2})/2$, we use
induction on $n$ to obtain $x_n - x_{n+1} = (-1)^n / 2^{n-1}$ for all $n \in \mathbb{N}$.
Hence, if $m > n$, then

\[
\begin{aligned}
|x_n - x_m| &\le |x_n - x_{n+1}| + |x_{n+1} - x_{n+2}| + \cdots + |x_{m-1} - x_m| \\
&= \sum_{k=n}^{m-1} \frac{1}{2^{k-1}} = \frac{1}{2^{n-1}} \sum_{k=0}^{m-n-1} \frac{1}{2^k}
= \frac{1}{2^{n-1}} \cdot \frac{1 - (1/2)^{m-n}}{1 - (1/2)} \\
&< \frac{1}{2^{n-1}} \cdot \frac{1}{1 - (1/2)} = \frac{1}{2^{n-2}}.
\end{aligned}
\]

Since $1/2^{n-2}$ decreases monotonically with $n$, it is possible to
choose $N$ for any $\varepsilon > 0$ such that $1/2^{N-2} < \varepsilon$. We thus conclude
that

\[ m > n > N \;\Rightarrow\; |x_n - x_m| < \frac{1}{2^{n-2}} < \frac{1}{2^{N-2}} < \varepsilon, \]

which means that $(x_n)$ is a Cauchy sequence.
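The closed form for consecutive differences and the resulting tail bound can be verified exactly for the first terms. The sketch below uses exact rational arithmetic; the function names and the cutoff of 30 terms are assumptions made for this illustration, and a finite check does not replace the induction argument.

```python
from fractions import Fraction


def averaged_sequence(count):
    """First `count` terms of x_1 = 1, x_2 = 2, x_n = (x_{n-1} + x_{n-2}) / 2."""
    xs = [Fraction(1), Fraction(2)]
    while len(xs) < count:
        xs.append((xs[-1] + xs[-2]) / 2)
    return xs


def step_formula_holds(xs):
    """Check x_n - x_{n+1} = (-1)^n / 2^(n-1); xs[i] stores x_{i+1}."""
    return all(
        xs[n - 1] - xs[n] == Fraction((-1) ** n, 2 ** (n - 1))
        for n in range(1, len(xs))
    )


def tail_bound_holds(xs):
    """Check |x_n - x_m| < 1 / 2^(n-2) = 4 / 2^n for all m > n."""
    size = len(xs)
    return all(
        abs(xs[n - 1] - xs[m - 1]) < Fraction(4, 2 ** n)
        for n in range(1, size + 1)
        for m in range(n + 1, size + 1)
    )


xs = averaged_sequence(30)
print(step_formula_holds(xs), tail_bound_holds(xs))
```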


3. Suppose that the two sequences $(x_n)$ and $(y_n)$ converge to a common limit
$c$ and consider their shuffled sequence $(z_n)$ defined by

\[ (z_1, z_2, z_3, z_4, \cdots) = (x_1, y_1, x_2, y_2, \cdots). \]

Show that the sequence $(z_n)$ also converges to $c$.

Solution: Let $\varepsilon$ be any positive number. Since $x_n \to c$ and $y_n \to c$,
there are two positive integers $N_1$ and $N_2$ such that

\[ n > N_1 \;\Rightarrow\; |x_n - c| < \varepsilon \quad \text{and} \quad n > N_2 \;\Rightarrow\; |y_n - c| < \varepsilon. \]

Define $N = \max\{N_1, N_2\}$. Since $x_k = z_{2k-1}$ and $y_k = z_{2k}$ for all
$k \in \mathbb{N}$, we have

\[ k > N \;\Rightarrow\; |x_k - c| = |z_{2k-1} - c| < \varepsilon \quad \text{and} \quad |y_k - c| = |z_{2k} - c| < \varepsilon. \]

Hence, $n > 2N \Rightarrow |z_n - c| < \varepsilon$, which just means $\lim z_n = c$.
4. Show that $a^n / n^k \to \infty$ as $n \to \infty$, where $a > 1$ and $k > 0$.

Solution: We consider three cases in turn: (i) $k = 1$, (ii) $k < 1$,
and (iii) $k > 1$.

(i) Let $k = 1$. Then set $a = 1 + h$ with $h > 0$ to obtain

\[ a^n = (1 + h)^n = 1 + nh + \frac{n(n-1)}{2} h^2 + \cdots + h^n > \frac{n(n-1)}{2} h^2, \]

which results in

\[ \frac{a^n}{n} = \frac{(1 + h)^n}{n} > \frac{(n-1) h^2}{2} \to \infty \quad (n \to \infty). \]

(ii) The case of $k < 1$ is trivial since $a^n / n^k > a^n / n$ for any $n > 1$.

(iii) If $k > 1$, then $a^{1/k} > 1$ since $a > 1$. Hence, it follows from
the result of (i) that for any $M > 1$, we can find an $N$ so that
$n > N \Rightarrow (a^{1/k})^n / n > M$. This means that

\[ \frac{a^n}{n^k} = \left[ \frac{(a^{1/k})^n}{n} \right]^k > M^k > M, \]

which implies that $a^n / n^k \to \infty$.

5. Let $x_n = a^n / n!$ with $a > 0$. Show that the sequence $(x_n)$ converges to 0.


Solution: Let $k$ be a positive integer such that $k > 2a$, and define
$c = a^k / k!$. Then for any $a > 0$ and for any $n > k$, we have

\[ \frac{a^n}{n!} = c \cdot \frac{a}{k+1} \cdot \frac{a}{k+2} \cdots \frac{a}{n} < \frac{c}{2^{n-k}} = \frac{c \cdot 2^k}{2^n} < \frac{c \cdot 2^k}{n}, \tag{2.8} \]

where the last step uses $2^n > n$. Since (2.8) holds for every $n > k$, it also holds for
$n$ satisfying $n > 2^k c / \varepsilon$, where $\varepsilon$ is an arbitrarily small positive number. In
the latter case, we have

\[ \frac{a^n}{n!} < \frac{2^k c}{n} < \varepsilon, \]

which means that

\[ \lim_{n\to\infty} x_n = \lim_{n\to\infty} \frac{a^n}{n!} = 0. \]


2.3 Infinite Series of Real Numbers 
2.3.1 Limits of Infinite Series 

This section focuses on convergence properties of infinite series. The
importance of this issue will become apparent, particularly in connection with
certain branches of functional analysis such as Hilbert space theory and
orthogonal polynomial expansions, where infinite series of numbers (or of
functions) enter quite often (see Chaps. 4 and 5).

To begin with, we briefly review the basic properties of infinite series of real
numbers. Assume an infinite sequence $(a_1, a_2, \cdots, a_n, \cdots)$ of real numbers.
We can then form another infinite sequence $(A_1, A_2, \cdots, A_n, \cdots)$ with the
definition

\[ A_n = \sum_{k=1}^{n} a_k. \]

Here, $A_n$ is called the nth partial sum of the sequence $(a_n)$, and the
corresponding infinite sequence $(A_n)$ is called the sequence of partial sums
of $(a_n)$. The infinite sequence $(A_n)$ may or may not be convergent, which
depends on the features of $(a_n)$.

Let us introduce an infinite series defined by

\[ \sum_{k=1}^{\infty} a_k = a_1 + a_2 + \cdots. \tag{2.9} \]

The infinite series (2.9) is said to converge if and only if the sequence $(A_n)$
converges to the limit denoted by $A$. In other words, the series (2.9) converges
if and only if the sequence of the remainder $R_{n+1} = A - A_n$ converges to
zero. When $(A_n)$ is convergent, its limit $A$ is called the sum of the infinite
series of (2.9), and we may write

\[ \sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k = \lim_{n\to\infty} A_n = A. \]

Otherwise, the series (2.9) is said to diverge.

The limit of the sequence ( A n ) is formally defined in line with Cauchy’s 
procedure as shown below. 

Limit of a sequence of partial sums:

The sequence of partial sums $(A_n)$ has a limit $A$ if for any small $\varepsilon > 0$,
there exists a number $N$ such that

\[ n > N \;\Rightarrow\; |A_n - A| < \varepsilon. \tag{2.10} \]


Examples

1. The infinite series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ converges to 1 because

\[ A_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} = \sum_{k=1}^{n} \left( \frac{1}{k} - \frac{1}{k+1} \right) = 1 - \frac{1}{n+1} \to 1 \quad (n \to \infty). \]

2. The series $\sum_{k=1}^{\infty} (-1)^k$ diverges because the sequence

\[ A_n = \sum_{k=1}^{n} (-1)^k = \begin{cases} 0 & (n \text{ is even}), \\ -1 & (n \text{ is odd}) \end{cases} \]

does not approach any limit.

3. The series $\sum_{k=1}^{\infty} 1 = 1 + 1 + 1 + \cdots$ diverges since the sequence
$A_n = \sum_{k=1}^{n} 1 = n$ increases without limit as $n \to \infty$.
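The partial sums in these three examples are easy to tabulate. The sketch below is an illustration only; `partial_sums` is a helper name introduced here, and exact fractions are used for the first series so that the telescoping identity $A_n = 1 - 1/(n+1)$ can be compared without rounding.

```python
from fractions import Fraction


def partial_sums(term, n):
    """Return [A_1, ..., A_n] where A_m = term(1) + ... + term(m)."""
    sums, total = [], 0
    for k in range(1, n + 1):
        total += term(k)
        sums.append(total)
    return sums


# Example 1: telescoping series, A_n = 1 - 1/(n + 1).
telescoping = partial_sums(lambda k: Fraction(1, k * (k + 1)), 50)

# Example 2: partial sums alternate between -1 and 0, so no limit exists.
oscillating = partial_sums(lambda k: (-1) ** k, 6)

# Example 3: A_n = n grows without bound.
constant = partial_sums(lambda k: 1, 6)

print(telescoping[-1], oscillating, constant)
```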


2.3.2 Cauchy Criterion for Infinite Series 

The following is a direct application of the Cauchy criterion to the sequence
$(A_n)$, which consists of the partial sums $A_n = \sum_{k=1}^{n} a_k$:

Cauchy criterion for infinite series:

The sequence of partial sums $(A_n)$ converges if and only if for any small
$\varepsilon > 0$ there exists a number $N$ such that

\[ n, m > N \;\Rightarrow\; |A_n - A_m| < \varepsilon. \tag{2.11} \]


Similarly to the case of real sequences, the Cauchy criterion alluded to above
provides a necessary and sufficient condition for convergence of the sequence
$(A_n)$. Moreover, from the definition, it also gives a necessary and sufficient
condition for convergence of an infinite series $\sum a_k$. Below is an important
theorem associated with the latter statement.

Theorem:

If an infinite series $\sum_{k=1}^{\infty} a_k$ is convergent, then

\[ \lim_{n\to\infty} a_n = 0. \]


Proof By hypothesis, we have

\[ \lim_{n\to\infty} \sum_{k=1}^{n} a_k = \lim_{n\to\infty} A_n = A. \]

Hence,

\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} (A_n - A_{n-1}) = A - A = 0. \]
According to the theorem above, $\lim a_n = 0$ is a necessary condition for
the convergence of $(A_n)$. However, it is not a sufficient condition, as shown in
the following example.


Examples Let $a_k = 1/\sqrt{k}$. Although $\lim_{k\to\infty} a_k = 0$, the corresponding
infinite series $\sum a_k$ diverges, as seen from

\[ \sum_{k=1}^{n} a_k = 1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}} \ge \frac{1}{\sqrt{n}} + \frac{1}{\sqrt{n}} + \cdots + \frac{1}{\sqrt{n}} = \frac{n}{\sqrt{n}} = \sqrt{n} \to \infty. \]


Remark. The contrapositive of the previous theorem serves as a divergence
test for the infinite series in question; we can say that

\[ \lim_{n\to\infty} a_n \ne 0 \;\Rightarrow\; \sum_{k=1}^{\infty} a_k \text{ is divergent.} \]


2.3.3 Absolute and Conditional Convergence 

Assume an infinite series

\[ \sum_{k=1}^{\infty} a_k \tag{2.12} \]

and an associated auxiliary series

\[ \sum_{k=1}^{\infty} |a_k|, \tag{2.13} \]

in the latter of which no term is negative. If the series (2.13) converges,
then the series (2.12) is said to converge absolutely. The necessary and
sufficient condition for absolute convergence of (2.12) is obtained by replacing
$A_n$ in (2.11) by

\[ B_n = |a_1| + |a_2| + \cdots + |a_n|. \]

If the series (2.13) diverges and the original series (2.12) converges, we say that
the series (2.12) converges conditionally. These results are summarized by
the statement below.
Absolute convergence:

The infinite series $\sum a_k$ is absolutely convergent if $\sum |a_k|$ is convergent.

Conditional convergence:

The infinite series $\sum a_k$ is conditionally convergent if $\sum a_k$ is convergent
and $\sum |a_k|$ is divergent.


Examples The infinite series

\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} \tag{2.14} \]

converges conditionally, since it converges while its absolute-value series
$\sum_{k=1}^{\infty} |(-1)^{k+1}/k| = \sum_{k=1}^{\infty} (1/k)$ diverges. See Exercises 1 and 3 in this
section.


The following is an important theorem that we use many times in the 
remainder of this book. 

Theorem:

An infinite series converges if it converges absolutely.


Proof Suppose that the sequence $(B_n)$ consisting of

\[ B_n = \sum_{k=1}^{n} |a_k| \]

converges as $n \to \infty$. This means that for any $\varepsilon > 0$ a number $N$ exists such
that

\[ n, m > N \;\Rightarrow\; |B_n - B_m| < \varepsilon. \tag{2.15} \]

Assuming $n > m$, we rewrite the left-hand side of the inequality in (2.15) as

\[
\begin{aligned}
|B_n - B_m| &= |a_{m+1}| + |a_{m+2}| + \cdots + |a_n| \\
&\ge |a_{m+1} + a_{m+2} + \cdots + a_n| \\
&= |A_n - A_m|,
\end{aligned}
\tag{2.16}
\]

where we used the triangle inequality for sums. Hence, it follows from (2.15)
and (2.16) that

\[ n, m > N \;\Rightarrow\; |A_n - A_m| < \varepsilon, \]

which means that the series $\sum a_k$ converges.

The converse of the above theorem is not true. The series (2.14) is a
well-known example of a convergent series that is not absolutely convergent.

2.3.4 Rearrangements 

Observe that the conditionally convergent series (2.14) expressed by

\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots \tag{2.17} \]

may be rearranged in a number of ways, such as

\[ 1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} - \frac{1}{4} + \cdots \tag{2.18} \]

or

\[ -\frac{1}{2} + 1 + \frac{1}{3} - \cdots \tag{2.19} \]

or in any other way in which the terms $1, -1/2, 1/3, -1/4, \cdots$ are added in a
certain order. Series such as (2.18) and (2.19) are called rearrangements of
the series (2.17).

Of importance is the fact that rearranging procedures may change the
convergence property of a conditionally convergent series; in what way this
happens depends on the nature of the original series, as we shall now see.
Suppose a series $\sum a_n$ to be conditionally convergent. Then, the sum of its
positive terms and that of its negative terms go, respectively, to $+\infty$ and $-\infty$;
otherwise the original series would diverge or converge absolutely. Let $(b_n)$
and $(c_n)$ be, respectively, the subsequences of positive and negative terms of
$(a_n)$. Since $\sum_{k=1}^{n} b_k$ is monotonically increasing with respect to $n$ and
unbounded, there is a positive integer $m_1$ such that

\[ \sum_{k=1}^{m_1} b_k > 1 - c_1. \]

Here the right-hand side is positive since $c_1$ is negative. We rewrite it as

\[ \sum_{k=1}^{m_1} b_k + c_1 > 1. \]

Similarly, there is an integer $m_2 > m_1$ such that

\[ \sum_{k=m_1+1}^{m_2} b_k + c_2 > 1. \]

Continue the same process for $m_3, m_4, \cdots, m_n$ and take the sum of each
side to obtain

\[ \sum_{k=1}^{m_n} b_k + \sum_{k=1}^{n} c_k > n. \tag{2.20} \]
Note that the left-hand side is a partial sum of the rearrangement of the
sequence $(a_k)$ that may, for instance, take the form of

\[ (b_1, b_2, \cdots, b_{m_1}, c_1, b_{m_1+1}, b_{m_1+2}, \cdots, b_{m_2}, c_2, \cdots). \tag{2.21} \]

Clearly, the left-hand side of (2.20) diverges as $n \to \infty$, which means that
the rearrangement (2.21) diverges. Therefore, a conditionally convergent
series may become divergent through the rearranging procedure. In fact, the
discussion above serves as part of the proof of the theorem below.

Riemann theorem:

Given any conditionally convergent series and any $r \in \overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$, there
is a rearrangement of the series that converges to $r$.


Proof The case of $r = \infty$ was proved in the previous discussion. Now let
$r \in \mathbb{R}$ and assume that $(b_n)$ and $(c_n)$ are the subsequences of positive and
negative terms, respectively, in the same order in which they appear in $(a_n)$.
It is possible to obtain the smallest sum such that

\[ s_1 = \sum_{k=1}^{m_1} b_k \]

exceeds $r$. Then, add the least number of negative terms $c_k$ to obtain the
largest sum such that

\[ s_2 = \sum_{k=1}^{m_1} b_k + \sum_{k=1}^{n_1} c_k \]

is less than $r$. Proceeding in this fashion, we obtain a sequence $s_1, s_2, s_3, \cdots$
that converges to $r$, since

\[ \lim_{n\to\infty} b_n = \lim_{n\to\infty} c_n = 0. \]

This result is the case for an arbitrary real number $r$. Hence, the proof is
complete.
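The construction in the proof can be imitated term by term for the series (2.14). The sketch below is a greedy variant written under stated assumptions: it takes the next unused positive term $1/(2j-1)$ whenever the running sum does not exceed the target and the next unused negative term $-1/(2j)$ otherwise. The function name, the floating-point arithmetic, and the step count are choices made for this illustration.

```python
def rearranged_partial_sums(target, steps):
    """Partial sums of a rearrangement of 1 - 1/2 + 1/3 - ... aimed at `target`."""
    next_odd, next_even = 1, 2   # denominators of the unused positive/negative terms
    total, sums = 0.0, []
    for _ in range(steps):
        if total <= target:
            total += 1.0 / next_odd    # next unused positive term
            next_odd += 2
        else:
            total -= 1.0 / next_even   # next unused negative term
            next_even += 2
        sums.append(total)
    return sums


for r in (2.0, 0.5, -1.0):
    final = rearranged_partial_sums(r, 100_000)[-1]
    print(f"target {r:+.1f}: partial sum after 100000 terms = {final:+.6f}")
```

Every term of the original series is used exactly once, and once the running sum has crossed the target in both directions its distance from the target is at most the size of the most recent term that caused a crossing, which tends to zero.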


Exercises 

1. Determine the convergence property of the series

\[ \sum_{k=1}^{\infty} \frac{1}{k}. \tag{2.22} \]

This is known as a harmonic series.

Solution: Let $A_n = \sum_{k=1}^{n} (1/k)$. We then have

\[ A_{2n} - A_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \frac{1}{2n} \times n = \frac{1}{2}, \]

which implies that the sequence $(A_n)$ is not a Cauchy sequence.
Thus, in view of the Cauchy criterion, the harmonic series (2.22)
diverges.

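2. Determine the convergence of the series

\[ \sum_{k=1}^{\infty} \frac{1}{k^p}. \tag{2.23} \]

This is called a hyperharmonic series (or zeta function) and is
denoted by $\zeta(p)$.

Solution: When $p \le 1$, a partial sum $A_{2^n}$ consisting of the first
$2^n$ terms reads

\[
\begin{aligned}
A_{2^n} &= 1 + \frac{1}{2^p} + \left( \frac{1}{3^p} + \frac{1}{4^p} \right) + \cdots + \left[ \frac{1}{(2^{n-1}+1)^p} + \cdots + \frac{1}{(2^n)^p} \right] \\
&\ge 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \cdots + \left[ \frac{1}{2^{n-1}+1} + \cdots + \frac{1}{2^n} \right] \\
&> \frac{1}{2} + \frac{1}{4} \times 2 + \frac{1}{8} \times 4 + \cdots + \frac{1}{2^n} \times 2^{n-1} = \frac{n}{2}.
\end{aligned}
\]

This means that the series (2.23) diverges for $p \le 1$.
For $p > 1$, we have

\[
\begin{aligned}
A_{2^{n+1}-1} &= 1 + \left( \frac{1}{2^p} + \frac{1}{3^p} \right) + \left( \frac{1}{4^p} + \cdots + \frac{1}{7^p} \right) + \cdots + \left[ \frac{1}{(2^n)^p} + \cdots + \frac{1}{(2^{n+1}-1)^p} \right] \\
&< 1 + \frac{1}{2^p} \times 2 + \frac{1}{4^p} \times 4 + \cdots + \frac{1}{(2^n)^p} \times 2^n \\
&= \sum_{k=0}^{n} \left( \frac{1}{2^{p-1}} \right)^k = \frac{1 - (1/2^{p-1})^{n+1}}{1 - (1/2^{p-1})} < \frac{2^{p-1}}{2^{p-1} - 1}.
\end{aligned}
\]

Hence, the monotonically increasing sequence $(A_n)$ is bounded
above and is thus convergent.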
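Both estimates in this solution can be spot-checked numerically. The following floating-point sketch evaluates the partial sums directly; the function names and the tested values of $p$ and $n$ are assumptions for this illustration, and a finite check says nothing about larger $n$.

```python
def partial_sum(p, n):
    """A_n = 1/1^p + 1/2^p + ... + 1/n^p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))


def lower_bound_holds(p, n_max):
    """For p <= 1: check A_{2^n} > n/2 for n = 1, ..., n_max."""
    return all(partial_sum(p, 2 ** n) > n / 2 for n in range(1, n_max + 1))


def upper_bound_holds(p, n_max):
    """For p > 1: check A_{2^(n+1)-1} < 2^(p-1) / (2^(p-1) - 1)."""
    bound = 2 ** (p - 1) / (2 ** (p - 1) - 1)
    return all(
        partial_sum(p, 2 ** (n + 1) - 1) < bound for n in range(1, n_max + 1)
    )


print(lower_bound_holds(1.0, 12), lower_bound_holds(0.5, 12))
print(upper_bound_holds(2.0, 12), upper_bound_holds(1.5, 12))
```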
3. Determine the convergence of the series

\[ \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}. \tag{2.24} \]

Solution: Let $n$ be an even integer, say $n = 2m$. Then, it follows
that

\[
A_{2m} = \sum_{k=1}^{2m} \frac{(-1)^{k+1}}{k}
= \left( 1 - \frac{1}{2} \right) + \left( \frac{1}{3} - \frac{1}{4} \right) + \cdots + \left( \frac{1}{2m-1} - \frac{1}{2m} \right)
= \frac{1}{2 \cdot 1} + \frac{1}{4 \cdot 3} + \cdots + \frac{1}{2m(2m-1)},
\]

which means that $(A_{2m})$ is increasing with $m$. In addition, we have

\[ A_{2m} = 1 - \left( \frac{1}{2} - \frac{1}{3} \right) - \cdots - \left( \frac{1}{2m-2} - \frac{1}{2m-1} \right) - \frac{1}{2m} < 1, \]

which indicates that $(A_{2m})$ is bounded above. Hence, $(A_{2m})$
converges to a limit $A$. Furthermore, since $A_{2m+1} = A_{2m} + 1/(2m+1)$,
the sequence $(A_{2m+1})$ also converges to the common limit $A$.
By applying the result for shuffled sequences (see Exercise 3 in Sect. 2.2),
we find that $\lim A_n$ exists, so the series (2.24) converges. Since the
absolute-value series is the divergent harmonic series (2.22), it is thus proven
that the series converges conditionally.


4. Suppose that the infinite series $\sum a_k$ and $\sum b_k$ are both absolutely
convergent. Let $(a_i b_j)$ be an infinite sequence in which the terms $a_i b_j$ are
arranged in an arbitrary order, say, as

\[ (a_2 b_1, a_1 b_3, a_2 b_4, a_5 b_1, \cdots). \]

Show that the sequence of the partial sums of $(a_i b_j)$ converges absolutely
regardless of the order of the terms $a_i b_j$.

Solution: Let $m$ and $n$ be the maximum values of $i$ and $j$,
respectively, that are involved in the partial sum $\sum_{(i,j)} a_i b_j$; here
$(i, j)$ denotes the possible combinations of $i$ and $j$ that are
arranged in the same order as in the sequence $(a_i b_j)$. The partial
sum is a portion of the product of the finite sums given by
$\left( \sum_{i=1}^{m} a_i \right) \left( \sum_{j=1}^{n} b_j \right)$. Hence, we have

\[ \sum_{(i,j)} |a_i b_j| = \sum_{(i,j)} |a_i| |b_j| \le \sum_{i=1}^{m} |a_i| \sum_{j=1}^{n} |b_j|. \tag{2.25} \]

By hypothesis, the right-hand side in (2.25) converges as $m, n \to \infty$.
This means that the partial sum $\sum_{(i,j)} |a_i b_j|$ is bounded above.
In addition, it is obviously increasing. Therefore, $\sum |a_i b_j|$
converges (i.e., $\sum a_i b_j$ converges absolutely) independently of the
order of $i$ and $j$ in the sequence of $(a_i b_j)$.

5. Show that rearrangements of absolutely convergent series always converge
absolutely to the same limit.

Solution: Let $\sum_{k=1}^{\infty} a_k$ be absolutely convergent and assume
that $\sum_{k=1}^{\infty} b_k$ is its rearrangement. Define $A_n = \sum_{k=1}^{n} |a_k|$,
$A = \lim_{n\to\infty} A_n$, $B_n = \sum_{k=1}^{n} |b_k|$, and let $\varepsilon > 0$. By hypothesis, there
is an integer $N$ such that $|A - A_N| = |a_{N+1}| + |a_{N+2}| + \cdots < \varepsilon/2$.
Now we choose the integer $M$ so that all the terms $a_1, a_2, \cdots, a_N$
appear in the first $M$ terms of the rearranged series, i.e., within
the finite sequence $(b_1, b_2, \cdots, b_M)$. Hence, these terms do not
contribute to the difference $B_m - A_N$, where $m \ge M$. Consequently,
we obtain

\[
\begin{aligned}
m \ge M &\;\Rightarrow\; |B_m - A_N| \le |a_{N+1}| + |a_{N+2}| + \cdots < \frac{\varepsilon}{2} \\
&\;\Rightarrow\; |A - B_m| \le |A - A_N| + |A_N - B_m| < \varepsilon,
\end{aligned}
\]

which shows that $\lim_{m\to\infty} B_m = A$.


2.4 Convergence Tests for Infinite Real Series 

2.4.1 Limit Tests 

This section covers the important tests for convergence of infinite series. In 
general, these tests provide sufficient, not necessary, conditions for conver- 
gence. This is in contrast to the Cauchy criterion, which provides a neces- 
sary and sufficient condition for convergence, though it is difficult to apply in 
practice. The first test to be shown is called the limit test, by which we can 
examine the absolute convergence of infinite series quite easily. 
Limit test for convergence:

If

\[ \lim_{k\to\infty} k^p a_k \text{ exists for some } p > 1, \]

then $\sum_{k=1}^{\infty} a_k$ converges absolutely (and thus converges in the ordinary sense).


Proof By hypothesis, we set $\lim_{k\to\infty} k^p a_k = A$ for a certain $p > 1$, which implies
that

\[ \lim_{k\to\infty} k^p |a_k| = |A|. \]

Hence, there exists an integer $m$ such that

\[ k > m \;\Rightarrow\; k^p |a_k| - |A| < 1, \]

or equivalently,

\[ k > m \;\Rightarrow\; |a_k| < \frac{|A| + 1}{k^p}. \tag{2.26} \]

We know that the series $\sum_{k=1}^{\infty} (1/k^p)$ converges for any $p > 1$ (see Exercise
2 in Sect. 2.3). Thus it follows from (2.26) that the series $\sum_{k>m} |a_k|$ also
converges, from which the desired conclusion follows at once.


There is a counterpart of the limit test for convergence that determines 
divergence properties of series as follows. 

Limit test for divergence:

If

\[ \lim_{k\to\infty} k a_k \ne 0, \]

then $\sum_{k=1}^{\infty} a_k$ diverges. The test fails if the limit equals zero.


Proof Suppose $\lim_{k\to\infty} k a_k = A > 0$. Then there exists an integer $m$ such that

\[ k > m \;\Rightarrow\; k a_k > \frac{A}{2}. \]

Hence, by employing the result for the harmonic series (see Exercise 1 in
Sect. 2.3), we obtain

\[ \sum_{k=m+1}^{\infty} a_k > \frac{A}{2} \sum_{k=m+1}^{\infty} \frac{1}{k} = \infty, \]

from which the desired result follows. The same procedure can be applied to
the case of $A < 0$, in which case the series $\sum (-a_k)$ may be treated by the
procedure above. The proof is thus complete.
Remark.

1. The test is valid even when the limit $A$ is infinite.

2. The divergence test described above is inconclusive when $\lim_{k\to\infty} k a_k = 0$. To
see why, consider the two series

\[ \sum_{k=1}^{\infty} \frac{1}{k^2} \quad \text{and} \quad \sum_{k=2}^{\infty} \frac{1}{k \log k}. \]

The former converges and the latter diverges, but both yield $\lim_{k\to\infty} k a_k = 0$.


2.4.2 Ratio Tests 


The following provides another test for absolute convergence of infinite series 
that is sometimes easier to use than the previous one. 


Ratio test:

A series $\sum_{k=0}^{\infty} a_k$ converges absolutely (and thus converges in the ordinary sense) if

\[ \limsup_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| < 1 \tag{2.27} \]

and diverges if

\[ \liminf_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| > 1. \tag{2.28} \]

If neither condition holds (in particular, if the ratio tends to 1), the test is
inconclusive.


Remark. When $|a_{k+1}/a_k|$ converges, the limit superior in (2.27) and the
limit inferior in (2.28) reduce to the ordinary limit.


Proof (i) Suppose that $t = \limsup_{k\to\infty} |a_{k+1}/a_k| < 1$. Then, for any $r \in (t, 1)$, we
can find a number $m$ such that

\[ k \ge m \;\Rightarrow\; \left| \frac{a_{k+1}}{a_k} \right| < r. \]

It follows that

\[ \left| \frac{a_{m+1}}{a_m} \right| \times \left| \frac{a_{m+2}}{a_{m+1}} \right| \times \cdots \times \left| \frac{a_{m+p}}{a_{m+p-1}} \right| < r^p, \]

or equivalently, $|a_{m+p}| < r^p |a_m|$, which holds for any $p \in \mathbb{N}$. Hence, we have

\[ \sum_{p=1}^{\infty} |a_{m+p}| = \sum_{k=m+1}^{\infty} |a_k| < \sum_{p=1}^{\infty} r^p |a_m| = \frac{r}{1 - r} |a_m|. \]

The last term is a finite constant. Therefore, the series $\sum_{k=m+1}^{\infty} |a_k|$
remains finite and the series $\sum_{k=0}^{\infty} a_k$ converges absolutely.

(ii) Next we assume that

\[ \liminf_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| = t > 1. \]

Then there is an integer $m$ such that

\[ k \ge m \;\Rightarrow\; \left| \frac{a_{k+1}}{a_k} \right| > 1. \]

That is,

\[ k > m \;\Rightarrow\; |a_k| > |a_m| > 0, \]

which means that

\[ \lim_{k\to\infty} a_k \ne 0. \]

In view of the remark in Sect. 2.3.2, the series $\sum_{k=0}^{\infty} a_k$ diverges.


2.4.3 Root Tests 

We now give an alternative absolute-convergence test based on examining the
$k$th root of $|a_k|$.


Root test:

A series $\sum_{k=0}^{\infty} a_k$ converges absolutely (and thus in the ordinary sense) if

\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|} < 1 \]

and diverges if

\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|} > 1. \]

If the limit superior is 1, the test fails and does not provide any information.


Proof Let $r = \limsup_{k\to\infty} \sqrt[k]{|a_k|}$. We first prove that the series converges
absolutely if $r < 1$. We choose a positive number $c \in (r, 1)$. Then there is a positive
integer $N$ such that

\[ k > N \;\Rightarrow\; \sqrt[k]{|a_k|} < c \;\Rightarrow\; |a_k| < c^k. \]

Since the geometric series $\sum c^k$ with $c < 1$ converges, $\sum |a_k|$ converges, so
that $\sum a_k$ converges absolutely.

When $r > 1$, it follows from the definition of the limit superior (see
Sect. 2.1.4) that there are an infinite number of terms of $\sqrt[k]{|a_k|}$ greater than
1. This implies that $\lim a_k \ne 0$, which means that the infinite series $\sum a_k$
diverges.

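Examples Assume the series

\[ \sum_{k=0}^{\infty} a_k = 1 - \frac{1}{2} + \frac{1}{4^2} - \frac{1}{2^3} + \frac{1}{4^4} - \frac{1}{2^5} + \cdots. \tag{2.29} \]

Since

\[ \sqrt[k]{|a_k|} = \begin{cases} 1/4 & (k \text{ is even}), \\ 1/2 & (k \text{ is odd}), \end{cases} \]

we have

\[ \limsup_{k\to\infty} \sqrt[k]{|a_k|} = \frac{1}{2} < 1. \]

Thus the series (2.29) converges (absolutely and in the ordinary sense).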
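For the series (2.29) it is instructive to compare the $k$th roots with the ratios of consecutive terms. In the sketch below, the function `a` encodes the terms as read off from (2.29), namely $4^{-k}$ for even $k$ and $-2^{-k}$ for odd $k$; this encoding and the cutoff of 40 terms are assumptions for this illustration.

```python
def a(k):
    """k-th term of (2.29): 4^(-k) for even k, -(2^(-k)) for odd k."""
    return 4.0 ** (-k) if k % 2 == 0 else -(2.0 ** (-k))


# k-th roots of |a_k| for k = 1, ..., 40: they alternate between 1/2 and 1/4.
roots = [abs(a(k)) ** (1.0 / k) for k in range(1, 41)]

# Ratios |a_{k+1} / a_k| for k = 0, ..., 39: they swing between tiny and huge.
ratios = [abs(a(k + 1) / a(k)) for k in range(0, 40)]

print("largest root:", max(roots))
print("smallest and largest ratio:", min(ratios), max(ratios))
```

The roots stay at or below $1/2$, which is what the root test needs, whereas the ratios equal $2^{k-1}$ for even $k$ and $2^{-k-2}$ for odd $k$ and therefore do not stay below 1. The example thus shows a series that the root test settles while the ratios alone give no decision.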

2.4.4 Alternating Series Test 

All the convergence tests presented so far are tests for absolute convergence,
which implies ordinary convergence. Nonetheless, certain kinds of series can
exhibit conditional convergence, i.e., ordinary convergence with absolute
divergence, whose convergence properties cannot be addressed by the tests given
thus far. Hence, the significance of the test described below, known as the
alternating series test, is that it may be used to test the conditional
convergence of some absolutely divergent series.

We say that $(x_k)$ is an alternating sequence if the sign of $x_k$ is different
from that of $x_{k+1}$ for every $k$. The resulting series $\sum x_k$ is called the
alternating series, whose convergence properties are partly determined by the
following theorem:

Alternating series test:

An alternating series given by

\[ a_1 - a_2 + a_3 - a_4 + \cdots = \sum_{k=1}^{\infty} (-1)^{k+1} a_k \quad \text{with } a_k > 0 \text{ for all } k \]

converges if

\[ a_k \ge a_{k+1} \quad \text{and} \quad \lim_{k\to\infty} a_k = 0. \]
Proof We show that the sequence of partial sums $(A_n)$ converges. It follows
that

\[ A_{2n} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2n-1} - a_{2n}). \]

Since $a_k - a_{k+1} \ge 0$ for all $k$, the sequence $(A_{2n})$ is increasing. It is also bounded
above because

\[ A_{2n} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n} \le a_1 \tag{2.30} \]

for all $n \in \mathbb{N}$. Thus, $\lim A_{2n}$ exists and we call it $A$. On the other hand, we
have

\[ |A_{2n+1} - A| = |A_{2n} + a_{2n+1} - A| \le |A_{2n} - A| + |a_{2n+1}|. \]

In the limit as $n \to \infty$, the right-hand side vanishes so that we obtain
$\lim A_{2n+1} = A$. Therefore, we conclude that $A_n \to A$.
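The structure of this proof (increasing even partial sums, bounded above by $a_1$, with odd partial sums differing from them by a vanishing term) can be observed numerically. The sketch below uses $a_k = 1/k$ as a sample choice; the function name and the number of terms are assumptions for this illustration.

```python
def alternating_partial_sums(a, n):
    """Return [A_1, ..., A_n] for the series a(1) - a(2) + a(3) - ..."""
    sums, total = [], 0.0
    for k in range(1, n + 1):
        total += (-1) ** (k + 1) * a(k)
        sums.append(total)
    return sums


sums = alternating_partial_sums(lambda k: 1.0 / k, 2000)
even = sums[1::2]   # A_2, A_4, ..., A_2000
odd = sums[0::2]    # A_1, A_3, ..., A_1999

even_increasing = all(s < t for s, t in zip(even, even[1:]))
odd_decreasing = all(s > t for s, t in zip(odd, odd[1:]))
bounded_by_a1 = all(s <= 1.0 for s in even)

print(even_increasing, odd_decreasing, bounded_by_a1, odd[-1] - even[-1])
```

The last printed value is $A_{1999} - A_{2000} = a_{2000}$, which illustrates how the even and odd partial sums are squeezed toward a common limit.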


Exercises 


1. Show that $\sum_{k=1}^{\infty} \frac{(k+1)^{1/2}}{(k^5 + k^3 - 1)^{1/3}}$ converges.

Solution: Taking $p = 7/6 > 1$ in the limit test for convergence,
we have

\[ \lim_{k\to\infty} k^{7/6} a_k = \lim_{k\to\infty} \frac{(1 + k^{-1})^{1/2}}{(1 + k^{-2} - k^{-5})^{1/3}} = 1. \]
2. Show that $\sum_{k=1}^{\infty} (-1)^k \frac{\log k}{k^2}$ converges.

Solution: With use of the limit test for convergence by taking
$p = 3/2$, we obtain

\[ \lim_{k\to\infty} k^{3/2} a_k = \lim_{k\to\infty} (-1)^k \frac{\log k}{\sqrt{k}} = 0. \]


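3. Show that $\sum_{k=1}^{\infty} \frac{k \log k}{1 + k^2}$ diverges.

Solution: From the limit test for divergence, we have

\[ \lim_{k\to\infty} k a_k = \lim_{k\to\infty} \frac{k^2 \log k}{1 + k^2} = \infty \ne 0. \]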
4. Show that $\sum_{k=0}^{\infty} \frac{(k!)^2}{(2k)!}$ converges.

Solution: The ratio test yields

\[ \frac{a_{k+1}}{a_k} = \frac{(2k)!}{(k!)^2} \cdot \frac{[(k+1)!]^2}{(2k+2)!} = \frac{(k+1)^2}{(2k+2)(2k+1)} = \frac{[1 + (1/k)]^2}{[2 + (2/k)][2 + (1/k)]} \to \frac{1}{4} < 1. \]

5. Show that ^ f 1 + - 


converges. 


Solution: The root test yields 


[i + (i/fc)r 
Real Functions


Abstract Infinite sequences and series of real functions are encountered frequently 
in mathematical physics. The convergence of such sequences and series does not 
generally preserve the nature of their constituents; e.g., a sequence of “continuous” 
functions can converge into a “discontinuous” function. In this chapter, we show 
that this is not true in cases of uniform convergence (Sect. 3.2.2), which is a special 
class of convergence that preserves the continuity, integrability, and differentiability 
of the constituent functions of sequences and series, as we explain in detail in Sects.
3.2.4-3.2.6.


3.1 Fundamental Properties 

3.1.1 Limit of a Function 

Having discussed the limits of sequences and series of real numbers, we now
turn our attention to the limit of functions. Let $A$ be a real number and $f(x)$
a real-valued function of a real variable $x \in \mathbf{R}$. A formal notation for this
function is the mapping $f : \mathbf{R} \to \mathbf{R}$. The statement “the
limit of $f(x)$ at $x = a$ is $A$” means that the value of $f(x)$ can be made as close
to $A$ as desired by taking $x$ sufficiently close to $a$. This is stated formally by
the following definition.


♠ The limit of a function:

A function $f(x)$ is said to have the limit $A$ as $x \to a$ if and only if for
every $\varepsilon > 0$, there exists a number $\delta > 0$ such that

$$0 < |x - a| < \delta \;\Rightarrow\; |f(x) - A| < \varepsilon. \qquad (3.1)$$

The limit of $f(x)$ is written symbolically as

$$\lim_{x \to a} f(x) = A \quad \text{or} \quad f(x) \to A \ \text{ for } x \to a.$$

If the first inequality in (3.1) is replaced by $0 < x - a < \delta$ (or $0 < a - x < \delta$),
we say that $f(x)$ approaches $A$ as $x \to a$ from above (or below) and write

$$\lim_{x \to a+} f(x) = A \quad \text{or} \quad \lim_{x \to a-} f(x) = A.$$

This is called the right-hand (or left-hand) limit of $f(x)$. The two together
are known as one-sided limits.

A necessary and sufficient condition for the existence of $\lim_{x \to a} f(x)$ is
shown below.

♠ Theorem:

The limit of $f(x)$ at $x = a$ exists if and only if

$$\lim_{x \to a+} f(x) = \lim_{x \to a-} f(x). \qquad (3.2)$$


Proof If $\lim_{x \to a} f(x)$ exists and is equal to $A$, it readily follows that

$$\lim_{x \to a+} f(x) = \lim_{x \to a-} f(x) = A. \qquad (3.3)$$

We now consider the converse. Assume that (3.2) holds, which means that both
one-sided limits exist at $x = a$ and share a common value; call it $A$. Hence,
given $\varepsilon > 0$, we have $\delta_1 > 0$ and $\delta_2 > 0$ such that

$$0 < x - a < \delta_1 \;\Rightarrow\; |f(x) - A| < \varepsilon,$$

$$0 < a - x < \delta_2 \;\Rightarrow\; |f(x) - A| < \varepsilon.$$

Let $\delta = \min\{\delta_1, \delta_2\}$. If $x$ satisfies $0 < |x - a| < \delta$, then either

$$0 < x - a < \delta \le \delta_1 \quad \text{or} \quad 0 < a - x < \delta \le \delta_2.$$

In either case, we have $|f(x) - A| < \varepsilon$. That is, we have seen that for a given
$\varepsilon$, there exists $\delta$ such that

$$0 < |x - a| < \delta \;\Rightarrow\; |f(x) - A| < \varepsilon.$$

Therefore we conclude that

$$\text{Equation (3.2) holds} \;\Rightarrow\; \lim_{x \to a} f(x) = A,$$

and the proof is complete.

3.1.2 Continuity of a Function

In general, the value of $\lim_{x \to a} f(x)$ has nothing to do with the value (and
the existence) of $f(a)$. For instance, a function defined piecewise so that
$f(1) = 2$ while $f(x) \to 0$ as $x \to 1$ gives

$$\lim_{x \to 1} f(x) = 0 \quad \text{and} \quad f(1) = 2,$$

which differ from one another. This mismatch occurring
at $x = 1$ results in a lack of geometrical continuity in the curve of $y = f(x)$,
as depicted in Fig. 3.1. In mathematical language, continuity of the curve of
$y = f(x)$ is accounted for by the following statement.



Fig. 3.1. A function $y = f(x)$ that is discontinuous at $x = 1$


♠ Continuous functions:

The function $f(x)$ is said to be continuous at $x = a$ if and only if for
every $\varepsilon > 0$, there exists $\delta > 0$ such that

$$|x - a| < \delta \;\Rightarrow\; |f(x) - f(a)| < \varepsilon.$$


Remark. The definition noted above seems to be similar to the definition of the
limit of $f(x)$ at $x = a$ (see Sect. 3.1.1). However, there is a crucial difference
between them. When considering the limit of $f(x)$ at $x = a$, we are only
interested in the behavior of $f(x)$ in the vicinity of the point $a$, not at $a$ itself.
The continuity of $f(x)$ at $x = a$, however, requires the further condition that
the value of $f(x)$ at $x = a$ is defined and coincides with the limit. In symbols, we write

$$f(x) \text{ is continuous at } x = a \;\Rightarrow\; \lim_{x \to a-0} f(x) = \lim_{x \to a+0} f(x) = f(a).$$

We must emphasize that given a function $f(x)$ on a domain $D$, the limit of
$f(x)$ is defined at limit points of $D$ that may or may not lie in $D$. In contrast,
the continuity of $f(x)$ is defined only at points contained in $D$. An illustrative
example is given below.

Examples Assume a function given by

$$f(x) = x \quad \text{for all } x \text{ but } x = 1.$$

It has a limit at $x = 1$,

$$\lim_{x \to 1} f(x) = 1,$$

but there is no way to examine its continuity there because $x = 1$ lies outside the
domain of definition.

When $f(x)$ is continuous, we say that $f(x)$ belongs to the class of functions
designated by the symbol $C$. Then, it follows that

$$f(x) \in C \text{ at } x = a \iff \lim_{x \to a} f(x) = f(a).$$

If the symbol $x \to a$ appearing in the right-hand statement is replaced by
$x \to a+$ (or $x \to a-$), $f(x)$ is said to be continuous on the right (or left)
at $x = a$. We encounter this kind of limit particularly when we consider
the continuity of a function defined on a closed interval $[a, b]$; we say that

$$f(x) \in C \text{ on } [a, b] \iff f(x) \in C \text{ on } (a, b) \ \text{ and } \ \lim_{x \to a+} f(x) = f(a), \ \lim_{x \to b-} f(x) = f(b).$$

We also say that a function $f(x)$ on $[a, b]$ is piecewise continuous if
(i) $f(x)$ is continuous on $[a, b]$ except at a finite number of points $x_1, x_2, \cdots, x_n$, and
(ii) at each of the points $x_1, x_2, \cdots, x_n$, there exist both the left-hand and
right-hand limits of $f(x)$ defined by

$$f(x_k - 0) = \lim_{x \to x_k - 0} f(x), \qquad f(x_k + 0) = \lim_{x \to x_k + 0} f(x).$$


3.1.3 Derivative of a Function 


The following is a rigorous definition of the derivative of a real function.


♠ Derivative of a function:

If the limit

$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}$$

exists, it is called the derivative of $f(x)$ at $x = a$ and is denoted by $f'(a)$.
The function $f(x)$ is said to be differentiable at $x = a$ if $f'(a)$ exists.

Similar to the case of one-sided limits, it is possible to define one-sided
derivatives of real functions such as

$$f'(a+) = \lim_{x \to a+} \frac{f(x) - f(a)}{x - a}, \qquad f'(a-) = \lim_{x \to a-} \frac{f(x) - f(a)}{x - a}.$$


♠ Theorem:

If $f(x)$ is differentiable at $x = a$, then it is continuous at $x = a$. (The
converse is not true.)


Proof Assume $x \ne a$. Then

$$f(x) - f(a) = \frac{f(x) - f(a)}{x - a}\,(x - a).$$

By hypothesis, both $[f(x) - f(a)]/(x - a)$ and $x - a$ have
limits at $x = a$. Hence, we obtain

$$\lim_{x \to a} [f(x) - f(a)] = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot \lim_{x \to a} (x - a) = f'(a) \times 0 = 0.$$

Therefore,

$$\lim_{x \to a} f(x) = f(a),$$

i.e., $f(x)$ is continuous at $x = a$. That the converse is false can be seen by
considering $f(x) = |x|$; it is continuous at $x = 0$ but not differentiable there. ♣
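
The counterexample $f(x) = |x|$ can be checked numerically through the one-sided derivatives defined above. The following is only a minimal sketch: the helper name `one_sided_quotients` and the step sizes are illustrative assumptions, not part of the text.

```python
def one_sided_quotients(f, a, h):
    """Right- and left-hand difference quotients of f at a for a step h > 0."""
    right = (f(a + h) - f(a)) / h        # x approaches a from above
    left = (f(a - h) - f(a)) / (-h)      # x approaches a from below
    return right, left

# f(x) = |x| at a = 0: the quotients stay at +1 and -1 however small h is,
# so the two one-sided derivatives exist but disagree.
for h in (1e-1, 1e-3, 1e-6):
    print(h, one_sided_quotients(abs, 0.0, h))
```

Since the two quotients do not approach a common value, the two-sided limit defining $f'(0)$ does not exist, even though $|x|$ is continuous at $x = 0$.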

The term $C^n$ functions is used to indicate that all the derivatives of order
$\le n$ exist and are continuous; this is denoted by

$$f(x) \in C^n \iff f^{(n)}(x) \in C.$$

Such an $f(x)$ is said to be a $C^n$ function or to be of class $C^n$.

Examples 1. $f(x) = \begin{cases} 0, & x < 0 \\ x, & x \ge 0 \end{cases}$

$\Rightarrow f(x) \in C^0 (= C)$, but $f(x) \notin C^1$ at $x = 0$.


2. $f(x) = \begin{cases} 0, & x < 0 \\ x^2, & x \ge 0 \end{cases}$

$\Rightarrow f(x) \in C^1$, but $f(x) \notin C^2$ at $x = 0$.


3. The Taylor series expansion for functions $f \in C^n$ is given by

$$f(x) = \sum_{k \le n} \frac{1}{k!} \left. \frac{d^k f}{dx^k} \right|_{x = x_0} (x - x_0)^k + o\left(|x - x_0|^n\right).$$

3.1.4 Smooth Functions

We now introduce a new class of functions for which the derivative is continuous
over the domain of definition.

♠ Smooth functions:

The function $f(x)$ is said to be smooth on $[a, b]$ if $f'(x)$ exists
and is continuous on $[a, b]$.


In geometrical language, the above statement means that the direction of the
tangent changes continuously, without jumps, as it moves along the curve
$y = f(x)$ (see Fig. 3.2). Thus, the graph of a smooth function is a smooth
curve without any point at which the curve has two distinct tangents.

Similar to the case of piecewise continuity, the function $f(x)$ is said to
be piecewise smooth on the interval $[a, b]$ if $f(x)$ and its derivatives are
all piecewise continuous on $[a, b]$. The graph of a piecewise smooth function
is either a continuous or a discontinuous curve; furthermore, it can have a
finite number of points (called corners) at which the derivatives show jumps
(see Fig. 3.2). Every piecewise smooth function $f(x)$ is bounded and has
a bounded derivative everywhere, except at its corners and points of
discontinuity; $f'(x)$ does not exist in the sense of continuity at any of these
points.




Fig. 3.2. (a) A continuous function $y = f(x)$. (b) A piecewise smooth function
$y = f(x)$ having two points of discontinuity and one corner


3.2 Sequences of Real Functions 

3.2.1 Pointwise Convergence 

In this section we focus on convergence properties of sequences consisting of
real-valued functions of a real variable. Suppose that for each $n \in \mathbf{N}$, we have
a function $f_n(x)$ defined on a domain $D \subset \mathbf{R}$. We then say that we have a
sequence

$$(f_n(x) : n \in \mathbf{N})$$

of real-valued functions on $D$. If the sequence $(f_n(x))$ converges for every
$x \in D$, the sequence of functions is said to converge pointwise on $D$, and
the function defined by

$$f(x) = \lim_{n \to \infty} f_n(x)$$

is called the pointwise limit of $(f_n(x))$. The formal definition is given below.

♠ Pointwise convergence:

The sequence of functions $(f_n)$ is said to converge pointwise to $f$ on
$D$ if, for each $x \in D$ and given $\varepsilon > 0$, there is a natural number
$N = N(\varepsilon, x)$ (which depends on $\varepsilon$ and $x$) such that

$$n > N \;\Rightarrow\; |f_n(x) - f(x)| < \varepsilon.$$



Fig. 3.3. Converging behavior of $f_n(x) = x^n$ given in (3.4)


Examples Consider the sequence $(f_n)$ consisting of the functions

$$f_n(x) = x^n \qquad (3.4)$$

defined on the closed interval $[0, 1]$. It follows that the sequence converges
pointwise to

$$f(x) = \lim_{n \to \infty} f_n(x) = \begin{cases} 0 & \text{for } 0 \le x < 1, \\ 1 & \text{at } x = 1. \end{cases} \qquad (3.5)$$

See Fig. 3.3 for the converging behavior of $f_n(x)$ with increasing $n$.

The important point is the fact that under pointwise convergence, continuity
of the functions $f_n(x)$ is not preserved. In fact, $f_n(x)$ given in (3.4) is
continuous for each $n$ over the whole interval $[0, 1]$, whereas the limit $f(x)$
given in (3.5) is discontinuous at $x = 1$. This indicates that interchanging
the order of the limiting processes under pointwise convergence may produce
different results, as expressed by

$$\lim_{x \to 1} \lim_{n \to \infty} f_n(x) \ne \lim_{n \to \infty} \lim_{x \to 1} f_n(x).$$

Similar phenomena might occur in connection with the integrability and
differentiability of the functions $f_n(x)$. That is, under pointwise convergence,
the limit of a sequence of integrable or differentiable functions may not be
integrable or differentiable, respectively. Illustrative examples are given in
Exercises 1 and 2 in Sect. 3.2.
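
The dependence of $N(\varepsilon, x)$ on the point $x$ can be made concrete for $f_n(x) = x^n$ on $0 \le x < 1$, where the pointwise limit is 0. The sketch below is illustrative only; the function name `smallest_index` and the chosen tolerance are assumptions introduced here.

```python
def smallest_index(eps, x):
    """Smallest N >= 0 such that x**n < eps for every n > N (needs 0 <= x < 1, eps > 0)."""
    n = 0
    # x**n decreases with n, so it is enough to find the first exponent below eps.
    while x ** (n + 1) >= eps:
        n += 1
    return n

eps = 0.01
for x in (0.5, 0.9, 0.99, 0.999):
    # The required index grows without bound as x approaches 1.
    print(x, smallest_index(eps, x))
```

For a fixed tolerance, no single $N$ works for all $x$ in $[0, 1)$ at once, which is the numerical counterpart of the discontinuity of the limit function at $x = 1$.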


3.2.2 Uniform Convergence 

We know that if the sequence $(f_n(x))$ is pointwise convergent to $f(x)$ on
$D$, it is possible to choose $N(x)$ for any small $\varepsilon$ such that

$$m > N(x) \;\Rightarrow\; |f_m(x) - f(x)| < \varepsilon. \qquad (3.6)$$

In general, the least value of $N(x)$ that satisfies (3.6) will depend on $x$. But in
certain cases, we can choose $N$ independent of $x$ such that $|f_m(x) - f(x)| < \varepsilon$
for all $m > N$ and for all $x$ over the domain $D$. If this is true for any small
$\varepsilon$, the sequence $(f_n(x))$ is said to converge uniformly to $f(x)$ on $D$. The
formal definition is given below.

♠ Uniform convergence:

The sequence $(f_n)$ of real functions on $D \subset \mathbf{R}$ converges uniformly
to a function $f$ on $D$ if, given $\varepsilon > 0$, there is a positive integer $N = N(\varepsilon)$
(which depends only on $\varepsilon$) such that

$$n > N \;\Rightarrow\; |f_n(x) - f(x)| < \varepsilon \ \text{ for all } x \in D.$$


Emphasis is placed on the fact that the integer $N = N(\varepsilon, x)$ in pointwise
convergence depends on $x$ in general, whereas $N = N(\varepsilon)$ in uniform
convergence is independent of $x$. Under uniform convergence, therefore, by
taking $n$ large enough we can always force the graph of $y = f_n(x)$ into a band
of width less than $2\varepsilon$ centered around the graph of $y = f(x)$ over the whole
domain $D$ (see Fig. 3.4).



Fig. 3.4. A function $y = f_n(x)$ contained entirely within a band of width less than $2\varepsilon$


The definition of uniform convergence noted above is equivalent to the
following statement.

Theorem:

The sequence $(f_n)$ of real functions on $D \subset \mathbf{R}$ converges uniformly to $f$
on $D$ if and only if

$$\sup_{x \in D} |f_n(x) - f(x)| \to 0 \ \text{ as } n \to \infty.$$
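
The supremum criterion lends itself to a quick numerical check. The sketch below estimates $\sup |f_n(x) - f(x)|$ for $f_n(x) = x^n$ and $f(x) = 0$ on two sets: the closed interval $[0, 0.9]$ and a set of points of $[0, 1)$ approaching 1. The grid sizes and the name `sup_distance` are assumptions made for illustration, and a finite grid only approximates a supremum.

```python
def sup_distance(n, xs):
    """Grid estimate of sup |f_n(x) - f(x)| for f_n(x) = x**n and f(x) = 0 on [0, 1)."""
    return max(abs(x ** n - 0.0) for x in xs)

grid_09 = [0.9 * i / 1000 for i in range(1001)]        # samples of [0, 0.9]
near_one = [1.0 - 10.0 ** (-j) for j in range(1, 13)]  # points of [0, 1) approaching 1

for n in (10, 50, 200):
    # First column tends to 0 (uniform on [0, 0.9]); second stays close to 1.
    print(n, sup_distance(n, grid_09), sup_distance(n, near_one))
```

On $[0, 0.9]$ the estimate equals $0.9^n$ and tends to zero, whereas near $x = 1$ it remains close to 1 for every $n$, so the convergence is uniform on the first set but not on $[0, 1)$.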


3.2.3 Cauchy Criterion for Series of Functions 

As in the case of real sequences, the Cauchy criterion is available for testing
uniform convergence for sequences of functions.

♠ Cauchy criterion for uniform convergence:

The sequence $(f_n)$ defined on $D \subset \mathbf{R}$ converges uniformly to $f$ on $D$ if
and only if, given $\varepsilon > 0$, there is a positive integer $N = N(\varepsilon)$ such that

$$m, n > N \;\Rightarrow\; |f_m(x) - f_n(x)| < \varepsilon \ \text{ for all } x \in D, \qquad (3.7)$$

or equivalently,

$$m, n > N \;\Rightarrow\; \sup_{x \in D} |f_m(x) - f_n(x)| < \varepsilon.$$

Proof Suppose that $f_n(x)$ converges uniformly to $f(x)$ on $D$. Let $\varepsilon > 0$ and
choose $N \in \mathbf{N}$ such that

$$n > N \;\Rightarrow\; |f_n(x) - f(x)| < \frac{\varepsilon}{2} \ \text{ for all } x \in D.$$

If $m, n > N$, we have

$$|f_n(x) - f_m(x)| \le |f_n(x) - f(x)| + |f(x) - f_m(x)| < \varepsilon \ \text{ for all } x \in D.$$

This result implies that if $f_n(x)$ is uniformly convergent to $f(x)$ on $D$, there
exists an $N$ that satisfies (3.7) for any small $\varepsilon$.

Next we consider the converse. Suppose that $(f_n)$ satisfies the criterion
given by (3.7). Then, for each point $x \in D$, $(f_n(x))$ forms a Cauchy sequence
and thus converges pointwise to

$$f(x) = \lim_{n \to \infty} f_n(x) \ \text{ for all } x \in D.$$

We now show that this convergence is uniform. Let $n > N$ be fixed and take
the limit $m \to \infty$ in (3.7) to obtain

$$n > N \;\Rightarrow\; |f_n(x) - f(x)| \le \varepsilon \ \text{ for all } x \in D,$$

where $N$ is independent of $x$, from which we conclude that the convergence
of $(f_n)$ to $f$ is uniform. ♣

3.2.4 Continuity of the Limit Function 

The most important feature of uniform convergence is that it overcomes some
of the shortcomings of pointwise convergence demonstrated in Sect. 3.2.1; i.e.,
pointwise convergence does not preserve the continuity, integrability, and
differentiability of the functions $f_n(x)$. We now examine the situation under
uniform convergence, starting with the continuity of $f_n(x)$.

♠ Theorem:

If $f_n$ converges uniformly to $f$ on $D \subset \mathbf{R}$ and each $f_n$ is continuous at
$c \in D$, then so is $f$.


Remark. Note that the uniform convergence of $f_n$ on $D$ is a sufficient, but not
a necessary, condition for $f$ to be continuous. In fact, if $f_n$ is not uniformly
convergent on $D$, then its limit $f$ may or may not be continuous at $c \in D$.

For the proof, it suffices to see that

$$\lim_{x \to c} f(x) = \lim_{x \to c} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to c} f_n(x) = \lim_{n \to \infty} f_n(c) = f(c), \qquad (3.8)$$

which guarantees the continuity of the limit function $f(x)$ at $x = c$. In (3.8),
we have used the interchangeability of limiting processes expressed by

$$\lim_{x \to c} \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \lim_{x \to c} f_n(x),$$

which follows from the lemma below.


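♠ Lemma:

Let $c$ be a limit point of $D \subset \mathbf{R}$ and assume that $f_n$ converges uniformly
to $f$ on $D \setminus \{c\}$. If

$$\lim_{x \to c} f_n(x) = \ell_n \qquad (3.9)$$

exists for each $n$, then
(i) $(\ell_n)$ is convergent, and

(ii) $\lim_{x \to c} f(x)$ exists and coincides with $\lim_{n \to \infty} \ell_n$; i.e.,

$$\lim_{n \to \infty} \lim_{x \to c} f_n(x) = \lim_{x \to c} \lim_{n \to \infty} f_n(x). \qquad (3.10)$$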


Proof Let $\varepsilon > 0$. Since $(f_n)$ converges uniformly on $D \setminus \{c\}$, it satisfies the
Cauchy criterion; i.e., there is a positive integer $N$ such that

$$m, n \ge N \;\Rightarrow\; |f_n(x) - f_m(x)| < \varepsilon \ \text{ for all } x \in D \setminus \{c\}. \qquad (3.11)$$

Take the limit $x \to c$ in (3.11) to obtain

$$m, n \ge N \;\Rightarrow\; |\ell_n - \ell_m| \le \varepsilon. \qquad (3.12)$$

This implies that $(\ell_n)$ is a Cauchy sequence and thus convergent, which proves
statement (i) above.

To prove (ii), let

$$\ell = \lim_{n \to \infty} \ell_n.$$

Set $n = N$ and $m \to \infty$ in (3.9), (3.11), and (3.12) to get the following results:

$$\lim_{x \to c} f_N(x) = \ell_N, \qquad (3.13)$$

$$|f_N(x) - f(x)| \le \varepsilon \ \text{ for all } x \in D \setminus \{c\}, \qquad (3.14)$$

and

$$|\ell_N - \ell| \le \varepsilon. \qquad (3.15)$$

In addition, the existence of the limit (3.13) implies that there exists a
$\delta > 0$ such that

$$|x - c| < \delta \ \text{ with } x \in D \setminus \{c\} \;\Rightarrow\; |f_N(x) - \ell_N| < \varepsilon. \qquad (3.16)$$

Using (3.14), (3.15) and (3.16), we obtain

$$|x - c| < \delta \ \text{ with } x \in D \setminus \{c\} \;\Rightarrow\; |f(x) - \ell| \le |f(x) - f_N(x)| + |f_N(x) - \ell_N| + |\ell_N - \ell| < 3\varepsilon.$$

This means that

$$\lim_{x \to c} f(x) = \ell,$$

which is equivalent to the desired result of (3.10). ♣


Remark. The contrapositive of the theorem tells us that if every $f_n$ is continuous
but the limit function $f$ is discontinuous, the convergence of $f_n$ is not uniform.
The example in Sect. 3.2.1 demonstrates such a sequence.


3.2.5 Integrability of the Limit Function 


We know that the limit function $f(x)$ is continuous if the sequence
$(f_n(x))$ of continuous functions is uniformly convergent. This immediately
results in the following theorem.


♠ Theorem:

Suppose that $f_n$ is integrable on $[a, b]$ for each $n$. Then, if $f_n$ converges
uniformly to $f$ on $[a, b]$, the limit function $f$ is also integrable, so that

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx, \qquad (3.17)$$

or equivalently,

$$\int_a^b \lim_{n \to \infty} f_n(x)\,dx = \lim_{n \to \infty} \int_a^b f_n(x)\,dx.$$

Proof Since $f_n$ for every $n$ is integrable on $[a, b]$, it is continuous (piecewise,
at least) on $[a, b]$. Thus $f(x)$ is also continuous (piecewise at least) on $[a, b]$ in
view of the theorem given in Sect. 3.2.4, so that $f(x)$ is integrable on $[a, b]$.
Furthermore, we observe that

$$\begin{aligned}
\left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right|
&\le \int_a^b |f_n(x) - f(x)|\,dx \\
&\le \int_a^b \sup_{x \in [a,b]} |f_n(x) - f(x)|\,dx \\
&\le (b - a) \sup_{x \in [a,b]} |f_n(x) - f(x)|.
\end{aligned}$$

The uniform convergence of $(f_n)$ ensures that

$$\sup_{x \in [a,b]} |f_n(x) - f(x)| \to 0 \ \text{ as } n \to \infty,$$

which immediately gives the desired result shown in (3.17). ♣

Remark.

1. Note again that uniform convergence is a sufficient but not a necessary
condition for (3.17) to be valid, so (3.17) may be valid even in the absence
of uniform convergence. For instance, the convergence of $(f_n)$ with $f_n(x) = x^n$
on $[0, 1]$ is not uniform but we have

$$\int_0^1 f_n(x)\,dx = \int_0^1 x^n\,dx = \frac{1}{n+1} \to 0 = \int_0^1 f(x)\,dx.$$

2. The conditions on $f_n$ stated in the theorem will be significantly relaxed
when we take up the Lebesgue integral in Chap. 6.


3.2.6 Differentiability of the Limit Function 


After the last two subsections, readers may expect that results for
differentiability will be similar to those for continuity and integrability; i.e.,
they may be tempted to conclude that the differentiability of the functions $f_n(x)$ will
be preserved if $(f_n)$ converges uniformly to $f$. However, this is not the case.
In fact, even if $f_n$ converges uniformly to $f$ on $[a, b]$ and $f_n$ is differentiable
at $c \in [a, b]$, it may occur that

$$\lim_{n \to \infty} f_n'(c) \ne f'(c).$$

Consider the following example:

Examples Suppose the sequence $(f_n)$ is defined by

$$f_n(x) = \sqrt{x^2 + \frac{1}{n^2}}, \quad x \in [-1, 1]. \qquad (3.18)$$

Clearly (3.18) is differentiable for each $n$, and the sequence $(f_n)$ converges
uniformly on $[-1, 1]$ to

$$f(x) = |x| \qquad (3.19)$$

since

$$|f_n(x) - f(x)| = \left| \sqrt{x^2 + \frac{1}{n^2}} - \sqrt{x^2} \right| = \frac{1/n^2}{\sqrt{x^2 + 1/n^2} + \sqrt{x^2}} \le \frac{1}{n} \to 0, \quad \text{for all } x \in [-1, 1].$$

However, the limit function $f$ of (3.19) is not differentiable at $x = 0$. Hence,
the desired result

$$\lim_{n \to \infty} f_n'(x) = f'(x) \qquad (3.20)$$

breaks down at $x = 0$.
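
A short numerical sketch makes the two halves of this example visible: the uniform bound $1/n$ on $|f_n(x) - |x||$, and the behavior of the derivatives at and near $x = 0$. The grid and the helper names `f_n` and `df_n` are assumptions made for illustration; the derivative formula $x / \sqrt{x^2 + 1/n^2}$ follows from differentiating (3.18).

```python
import math

def f_n(x, n):
    """f_n(x) = sqrt(x^2 + 1/n^2), as in (3.18)."""
    return math.sqrt(x * x + 1.0 / n**2)

def df_n(x, n):
    """Derivative of f_n: x / sqrt(x^2 + 1/n^2)."""
    return x / math.sqrt(x * x + 1.0 / n**2)

xs = [-1 + 2 * i / 2000 for i in range(2001)]   # grid on [-1, 1] containing x = 0
for n in (1, 10, 100, 1000):
    sup_err = max(abs(f_n(x, n) - abs(x)) for x in xs)
    # sup_err is attained at x = 0 and equals 1/n; the slope at 0 is always 0,
    # while the slope at x = 0.5 approaches +1, the slope of |x| there.
    print(n, sup_err, df_n(0.0, n), df_n(0.5, n))
```

The derivatives $f_n'(0)$ are all zero and so converge, but the limit function $|x|$ has no derivative at the origin for them to converge to.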

The following theorem provides sufficient conditions for (3.20) to be
satisfied. The important point is that it requires the uniform convergence of the
derivatives $f_n'$, not of the functions $f_n$ themselves.

Theorem:

Suppose $(f_n)$ to be a sequence of differentiable functions on $[a, b]$ that
converges at a certain point $x_0 \in [a, b]$. If the sequence $(f_n')$ is uniformly
convergent on $[a, b]$, then

(i) $(f_n)$ is also uniformly convergent on $[a, b]$ to a function $f$,

(ii) $f$ is differentiable on $[a, b]$, and

(iii) $\lim_{n \to \infty} f_n'(x) = f'(x)$.


Proof Let $\varepsilon > 0$. From the convergence of $(f_n(x_0))$ and the uniform
convergence of $(f_n')$, we conclude that there is an $N \in \mathbf{N}$ such that

$$m, n > N \;\Rightarrow\; |f_n'(x) - f_m'(x)| < \varepsilon \ \text{ for all } x \in [a, b] \qquad (3.21)$$

and

$$m, n > N \;\Rightarrow\; |f_n(x_0) - f_m(x_0)| < \varepsilon. \qquad (3.22)$$

Given any two points $x, t \in [a, b]$, it follows from the mean value theorem
applied to $f_n - f_m$ that there is a point $c$ between $x$ and $t$ such that

$$f_n(x) - f_m(x) - [f_n(t) - f_m(t)] = (x - t)\,[f_n'(c) - f_m'(c)].$$

Using (3.21), we have

$$m, n > N \;\Rightarrow\; |f_n(x) - f_m(x) - [f_n(t) - f_m(t)]| \le \varepsilon |x - t|. \qquad (3.23)$$

From (3.22) and (3.23), it follows that

$$\begin{aligned}
|f_n(x) - f_m(x)| &\le |f_n(x) - f_m(x) - [f_n(x_0) - f_m(x_0)]| + |f_n(x_0) - f_m(x_0)| \\
&< \varepsilon |x - x_0| + \varepsilon \\
&\le \varepsilon (b - a + 1) = C\varepsilon, \quad \text{for all } x \in [a, b],
\end{aligned}$$

which means that $(f_n)$ converges uniformly to some limit $f$. Hence, statement
(i) has been proven.

Next we consider the proofs of (ii) and (iii). For any fixed point $x \in [a, b]$,
define

$$g_n(t) = \frac{f_n(t) - f_n(x)}{t - x}, \quad t \in [a, b] \setminus \{x\}$$

and

$$g(t) = \frac{f(t) - f(x)}{t - x}, \quad t \in [a, b] \setminus \{x\}.$$

Clearly, $g_n \to g$ as $n \to \infty$; furthermore, if $m, n > N$, the result of (3.23) tells
us that

$$|g_n(t) - g_m(t)| \le \varepsilon \ \text{ for all } t \in [a, b] \setminus \{x\}.$$

Thus in view of the Cauchy criterion, we see that $g_n$ converges uniformly to
$g$ on $[a, b] \setminus \{x\}$. Now we observe that

$$\lim_{t \to x} g_n(t) = f_n'(x) \ \text{ for all } n \in \mathbf{N}. \qquad (3.24)$$

Then, the uniform convergence of $g_n$ allows us to take the limit $n \to \infty$ in (3.24)
and to interchange the order of the limiting processes, which yields

$$\lim_{n \to \infty} \lim_{t \to x} g_n(t) = \lim_{t \to x} g(t) = \lim_{t \to x} \frac{f(t) - f(x)}{t - x} = f'(x) = \lim_{n \to \infty} f_n'(x).$$

This proves that $f$ is differentiable at $x$ and that

$$f'(x) = \lim_{n \to \infty} f_n'(x).$$ ♣


Remark. That the uniform convergence of $(f_n')$ is just sufficient, not necessary,
is seen by considering the sequence

$$f_n(x) = \frac{x^{n+1}}{n+1}, \quad x \in (0, 1).$$

This converges uniformly to 0, and its derivative $f_n'(x) = x^n$ also converges to
0. The conclusions (i)-(iii) given in the theorem above are thus all satisfied.
But the convergence of $(f_n')$ is not uniform.


Exercises 

1. For the function

$$f_n(x) = n x (1 - x^2)^n, \quad x \in [0, 1],$$

check that an interchange of the order of the limiting process $n \to \infty$
and integration gives different results.

Solution: The given function is integrable for each $n$ so that

$$\int_0^1 f_n(x)\,dx = n \int_0^1 x (1 - x^2)^n\,dx = \left[ -\frac{n}{2(n+1)} (1 - x^2)^{n+1} \right]_0^1 = \frac{n}{2(n+1)} \to \frac{1}{2}.$$

On the other hand, the limit given by

$$f(x) = \lim_{n \to \infty} f_n(x) = 0 \ \text{ for all } x \in [0, 1]$$

yields $\int_0^1 f(x)\,dx = 0$. We thus conclude that

$$\lim_{n \to \infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n \to \infty} f_n(x)\,dx;$$

i.e., interchanging the order of integration and limiting processes
is not in general allowed under pointwise convergence. ♣
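
The closed form $n / [2(n+1)]$ can be cross-checked by numerical quadrature. The following sketch uses a plain composite midpoint rule; the helper `midpoint_integral`, its step count, and the sample point $x = 0.5$ are assumptions chosen for illustration.

```python
def f_n(x, n):
    """f_n(x) = n x (1 - x^2)^n on [0, 1]."""
    return n * x * (1 - x * x) ** n

def midpoint_integral(g, a, b, steps=20_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 100, 1000):
    numeric = midpoint_integral(lambda x: f_n(x, n), 0.0, 1.0)
    exact = n / (2 * (n + 1))
    # The integrals approach 1/2 while the value at any fixed x > 0 collapses to 0.
    print(n, numeric, exact, f_n(0.5, n))
```

The mass of $f_n$ concentrates in a shrinking neighborhood of $x = 0$, which is how the integrals can stay near $1/2$ although the pointwise limit vanishes everywhere.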

2. For $f_n(x)$ given by

$$f_n(x) = \begin{cases} -1, & x \le -\frac{1}{n}, \\ \sin\left(\frac{n \pi x}{2}\right), & -\frac{1}{n} < x < \frac{1}{n}, \\ 1, & x \ge \frac{1}{n}, \end{cases}$$

check the continuity of its limit $f(x) = \lim_{n \to \infty} f_n(x)$ at $x = 0$.

Solution: $f_n(x)$ is differentiable for any $x \in \mathbf{R}$ for all $n$, and thus
is continuous at $x = 0$ for all $n$. However, its limit,

$$f(x) = \begin{cases} -1, & x < 0, \\ 0, & x = 0, \\ 1, & x > 0, \end{cases}$$

is not continuous at $x = 0$. Hence, for the sequence of functions
$(f_n(x))$, the order of the limiting process $n \to \infty$ and the
differentiation with respect to $x$ is not interchangeable. ♣

3. Show that the sequence of functions $(f_n(x))$ defined by

$$f_n(x) = n x e^{-nx} \qquad (3.25)$$

converges uniformly to $f(x) = 0$ on $x \ge a$ for every $a > 0$.

Solution: In view of the previous theorem, we show that

$$\sup \{ f_n(x) : x \ge a \} \to 0 \ \text{ as } n \to \infty,$$

where $a > 0$. To prove this, we consider the derivative

$$f_n'(x) = n e^{-nx} (1 - nx). \qquad (3.26)$$

It follows from (3.26) that $x = 1/n$ is the only critical point of $f_n$.
Now we choose a positive integer $N$ such that $a > 1/N$. Then, the
function $f_n$ for each $n > N$ has no critical point on $x \ge a$, and
is monotonically decreasing there. Therefore, the maximum of $f_n(x)$
on $x \ge a$ is attained at $x = a$ for any $n > N$, with the result that

$$\sup_{x \in [a, \infty)} f_n(x) = f_n(a) = n a e^{-na} \to 0 \quad (n \to \infty).$$

This holds for any $a > 0$; hence, we conclude that $f_n$ converges
uniformly to 0 on $[a, \infty)$ for every $a > 0$. ♣

Remark. Note that the convergence of (3.25) is uniform on every interval
$[a, \infty)$ with $a > 0$, but on neither $(0, \infty)$ nor $[0, \infty)$. Since the
critical point $x = 1/n$ lies in $(0, \infty)$ for every $n$, we have

$$\sup_{x \in (0, \infty)} f_n(x) = f_n\left(\frac{1}{n}\right) = \frac{1}{e} \not\to 0,$$

so it is clear that $(f_n)$ converges to 0 pointwise, but not uniformly, on
$(0, \infty)$ and hence on $[0, \infty)$.
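
The role of the critical point $x = 1/n$ can be illustrated with a few evaluations. In the sketch below, `sup_on_ray` is a hypothetical helper that relies on the monotonicity read off from (3.26): $f_n$ increases on $[0, 1/n]$ and decreases afterwards, so its supremum over $[a, \infty)$ is attained at $\max(a, 1/n)$.

```python
import math

def f_n(x, n):
    """f_n(x) = n x exp(-n x), as in (3.25)."""
    return n * x * math.exp(-n * x)

def sup_on_ray(n, a):
    """Supremum of f_n on [a, infinity), attained at max(a, 1/n) by (3.26)."""
    return f_n(max(a, 1.0 / n), n)

for n in (5, 50, 500):
    # With a = 0.1 the supremum decays to 0; with a = 0 it stays at 1/e.
    print(n, sup_on_ray(n, 0.1), sup_on_ray(n, 0.0))
```

The second column stays at $1/e \approx 0.368$ for every $n$ because the maximizer $1/n$ slides toward the origin as $n$ grows, which is exactly the obstruction to uniform convergence near $x = 0$.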


3.3 Series of Real Functions 

3.3.1 Series of Functions 

We close this chapter by considering convergence properties of series of
real-valued functions. Assume a sequence $(f_n)$ of functions defined on $D \subset \mathbf{R}$. By
analogy with series of real numbers, we can define a series of functions by

$$S_n(x) = \sum_{k=1}^{n} f_k(x), \quad x \in D,$$

which gives a sequence $(S_n) = (S_1, S_2, \cdots)$.

As $n$ increases, the sequence $(S_n)$ may or may not converge to a finite
value, depending on the nature of the functions $f_k(x)$ as well as the point $x$
in question. If the sequence converges for each point $x \in D$ (i.e., converges
pointwise on $D$), then the limit of $S_n$ is called the sum of the infinite series
of functions $f_k(x)$ and is denoted by

$$S(x) = \lim_{n \to \infty} S_n(x) = \sum_{k=1}^{\infty} f_k(x), \quad x \in D.$$

It is obvious that the convergence of the series $S_n(x)$ implies the pointwise
convergence $\lim_{n \to \infty} f_n(x) = 0$ on $D$. A series $(S_n)$ that does not converge at
a point $x \in D$ is said to diverge at that point.

Applied to series of functions, the Cauchy criterion for uniform convergence
takes the following form:


♠ Cauchy criterion for series of functions:

The series $S_n$ is uniformly convergent on $D$ if and only if for every small
$\varepsilon > 0$, there is a positive integer $N$ such that

$$n > m > N \;\Rightarrow\; |S_n(x) - S_m(x)| = \left| \sum_{k=m+1}^{n} f_k(x) \right| < \varepsilon \ \text{ for all } x \in D.$$


Set $n = m + 1$ in the above criterion to obtain

$$n > N + 1 \;\Rightarrow\; |f_n(x)| < \varepsilon \ \text{ for all } x \in D.$$

This result implies that the uniform convergence $f_n(x) \to 0$ on $D$ is a
necessary condition for the convergence of $S_n(x)$ to be uniform on $D$. We will
use this criterion when proving a more practical test for uniform convergence
known as the Weierstrass M-test, which is presented in Sect. 3.3.3.


3.3.2 Properties of Uniformly Convergent Series of Functions 

When a given series of functions $\sum f_k(x)$ is uniformly convergent, the
properties of the sum $S(x)$ in terms of continuity, integrability, and differentiability
can be easily inferred from the properties of the separate terms $f_k(x)$. In fact,
applying the theorems given in Sects. 3.2.4-3.2.6 to the sequence $(S_n)$ and
using the linearity of the limiting process, integration, and differentiation,
we obtain the parallel theorems shown below.

♠ Continuity of the sum:

Suppose $f_k(x)$ to be continuous for each $k$. If the sequence $(S_n)$ of the
partial sums

$$S_n(x) = \sum_{k=1}^{n} f_k(x)$$

converges uniformly to $S(x)$, then $S(x)$ is also continuous, so that

$$\lim_{t \to x} S(t) = \lim_{t \to x} \sum_{k=1}^{\infty} f_k(t) = \sum_{k=1}^{\infty} \lim_{t \to x} f_k(t).$$


♠ Integrability of the sum:

Suppose $f_k$ to be integrable on $[a, b]$ for all $k$. If $(S_n)$ converges uniformly
to $S$ on $[a, b]$, we have

$$\int_a^b S(x)\,dx = \int_a^b \sum_{k=1}^{\infty} f_k(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx.$$


♠ Differentiability of the sum:

Let $f_k$ be differentiable on $[a, b]$ for each $k$ and suppose that $(S_n)$
converges to $S$ at some point $x_0 \in [a, b]$. If the series $\sum f_k'$ is uniformly
convergent on $[a, b]$, then $S_n(x)$ is also uniformly convergent on $[a, b]$ and the
sum $S(x)$ is differentiable on $[a, b]$, so that

$$\frac{d}{dx} S(x) = \frac{d}{dx} \left[ \sum_{k=1}^{\infty} f_k(x) \right] = \sum_{k=1}^{\infty} \frac{d f_k(x)}{dx} \quad \text{for all } x \in [a, b].$$


Observe that the second and third theorems provide a sufficient condition for
performing term-by-term integration and differentiation, respectively, of an
infinite series of functions. Without uniform convergence, such term-by-term
calculations are not guaranteed to be valid.

3.3.3 Weierstrass M - test 

The following is a very useful and simple test for the uniform convergence of
a series of functions.


♠ Weierstrass M-test: If there is a sequence of positive constants $M_k$
such that

$$|f_k(x)| \le M_k \qquad (3.27)$$

for any $x$ on the interval $[a, b]$, and if the series

$$\sum_{k=0}^{\infty} M_k \qquad (3.28)$$

converges, then the series of functions $\sum_{k=0}^{\infty} f_k(x)$ converges uniformly on
$[a, b]$.

Proof Since (3.28) converges, it follows from the Cauchy criterion that for any
$\varepsilon > 0$ there exists a number $N$ such that

$$n > m > N \;\Rightarrow\; \sum_{k=0}^{n} M_k - \sum_{k=0}^{m} M_k = \sum_{k=m+1}^{n} M_k < \varepsilon. \qquad (3.29)$$

Furthermore, in view of the inequality rule for absolute values of sums (the
triangle inequality) and the relation (3.27), it follows that

$$\left| \sum_{k=m+1}^{n} f_k(x) \right| \le \sum_{k=m+1}^{n} |f_k(x)| \le \sum_{k=m+1}^{n} M_k \qquad (3.30)$$

for all $x \in [a, b]$. Note that the left-hand term in (3.30) can be rewritten as

$$\left| \sum_{k=m+1}^{n} f_k(x) \right| = \left| \sum_{k=0}^{n} f_k(x) - \sum_{k=0}^{m} f_k(x) \right|. \qquad (3.31)$$

From (3.29), (3.30), and (3.31), it follows that

$$n > m > N \;\Rightarrow\; \left| \sum_{k=0}^{n} f_k(x) - \sum_{k=0}^{m} f_k(x) \right| < \varepsilon \ \text{ for all } x \in [a, b],$$

which clearly indicates the uniform convergence of $\sum f_k(x)$ on $[a, b]$. ♣
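
The chain of inequalities in the proof can be tested on a concrete series. As an illustration only, the sketch below takes $f_k(x) = \sin(x/k^2)$ on $|x| \le c$ with majorants $M_k = c/k^2$, which is admissible because $|\sin t| \le |t|$; the function names, the values of $c$, $m$, $n$, and the grid are assumptions introduced here.

```python
import math

def partial_sum(x, n):
    """S_n(x) = sum_{k=1}^{n} sin(x / k^2)."""
    return sum(math.sin(x / k**2) for k in range(1, n + 1))

def majorant_tail(c, m, n):
    """sum_{k=m+1}^{n} M_k with M_k = c / k^2."""
    return sum(c / k**2 for k in range(m + 1, n + 1))

c, m, n = 5.0, 50, 400
xs = [-c + 2 * c * i / 200 for i in range(201)]      # grid on [-c, c]
worst = max(abs(partial_sum(x, n) - partial_sum(x, m)) for x in xs)

# The x-independent majorant tail bounds |S_n(x) - S_m(x)| at every grid point.
print(worst, majorant_tail(c, m, n))
```

Because the bound on the right does not involve $x$, making it small by increasing $m$ controls the whole family of differences at once, which is the mechanism behind the uniform Cauchy criterion.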


Exercises 

1. Determine the convergence of the series $\sum_{k=0}^{\infty} x^k$.

Solution: It obviously converges to $1/(1 - x)$ on the interval
$[-a, a]$ if $0 < a < 1$. We show that this convergence is uniform on
$[-a, a]$ for any $0 < a < 1$. A partial sum yields $S_n(x) = \sum_{k=0}^{n-1} x^k =
(1 - x^n)/(1 - x)$, so that

$$|S(x) - S_n(x)| = \frac{|x|^n}{|1 - x|} \le \frac{a^n}{1 - a} \quad \text{for } |x| \le a.$$

Since $0 < a < 1$, the last term decreases monotonically to zero with $n$;
hence, for a given $\varepsilon > 0$, we can find an $N$ such that $n > N \Rightarrow
a^n/(1 - a) < \varepsilon$. Clearly the value of $N$ does not depend on $x$.
Therefore, we conclude that the infinite series $\sum x^k$ is uniformly
convergent on $[-a, a]$ with $0 < a < 1$. ♣

2. Determine the convergence of the series $\sum_{k=0}^{\infty} (1 - x) x^k$.


Solution: This converges to

$$S(x) = \begin{cases} 1, & \text{for } 0 \le x < 1 \\ 0, & \text{at } x = 1 \end{cases}$$

but not uniformly. Actually, we have

$$|S(x) - S_n(x)| = \begin{cases} x^n, & 0 \le x < 1 \\ 0, & x = 1 \end{cases}$$

and if $\varepsilon = 1/4$, for instance, the inequality $x^n < 1/4$ cannot hold for
all $0 \le x < 1$ at any fixed $n$ because $x^n \to 1$ as $x \to 1$. ♣
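
The contrast between these two exercises can be checked numerically. The sketch below assumes the partial sum contains the terms $k = 0, \ldots, n-1$, so that the error of the second series is $x^n$; the helper names, the value $a = 0.9$, and the sample points are illustrative choices.

```python
def geometric_partial(x, n):
    """S_n(x) = sum_{k=0}^{n-1} x**k, a partial sum of the geometric series."""
    return sum(x ** k for k in range(n))

def sup_error_geometric(n, a, samples=401):
    """Grid estimate of sup |1/(1-x) - S_n(x)| over [-a, a]."""
    xs = [-a + 2 * a * i / (samples - 1) for i in range(samples)]
    return max(abs(1.0 / (1.0 - x) - geometric_partial(x, n)) for x in xs)

def sup_error_second(n):
    """Largest x**n = |S(x) - S_n(x)| for sum (1-x) x**k at points of [0, 1) near 1."""
    return max((1.0 - 10.0 ** (-j)) ** n for j in range(1, 13))

a = 0.9
for n in (10, 50, 100):
    # Column 2 tracks the bound a**n / (1 - a) in column 3 and tends to 0;
    # column 4 stays close to 1, so no N works for eps = 1/4.
    print(n, sup_error_geometric(n, a), a ** n / (1 - a), sup_error_second(n))
```

The first error is controlled by a bound independent of $x$, whereas the second can be pushed arbitrarily close to 1 by moving $x$ toward 1, however large $n$ is.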


3. Examine the uniform convergence of the series $\sum f_k(x)$, where

(i) $f_k(x) = $ (ii) $f_k(x) = \sin\left(\frac{x}{k^2}\right)$, and (iii) $f_k(x) = \frac{1}{k^2 x^2}$.

Solution:

(i) The series converges uniformly for every real $x$. Check this
by taking $M_k = 1/k^2$.


(ii) Let $D$ be a subset of $\mathbf{R}$ bounded by $c$, i.e., $|x| \le c$ for all
$x \in D$. Then we have

$$\left| \sin\left(\frac{x}{k^2}\right) \right| \le \frac{|x|}{k^2} \le \frac{c}{k^2} \quad \text{for all } x \in D.$$

Taking $M_k = c/k^2$ and noting that $\sum M_k$ is convergent, we
conclude that $\sum f_k$ is uniformly convergent on any bounded
subset of $\mathbf{R}$. Notably, however, this uniform convergence
disappears when we extend the domain $D$ to the whole of $\mathbf{R}$. This
is seen by noting that $f_k \to 0$ pointwise on $\mathbf{R}$, but

$$\sup_{x \in \mathbf{R}} |f_k(x)| \ge \left| f_k\left(\frac{\pi k^2}{2}\right) \right| = 1 \not\to 0,$$

which means that the convergence of $(f_k)$ to 0 is not uniform
on $\mathbf{R}$. In view of the criterion in Sect. 3.3.1, therefore, the
series $\sum f_k$ fails to converge uniformly on $\mathbf{R}$.


(iii) The series $\sum_k 1/(k^2 x^2)$ clearly converges pointwise on the
open set $\mathbf{R} \setminus \{0\}$. Now let $c > 0$. For all $x \in \mathbf{R}$ such that $|x| \ge c$,
we have $|f_k(x)| \le 1/(k^2 c^2)$ for all $k$. Since $\sum_k 1/(k^2 c^2)$ is
convergent, the series $\sum f_k$ converges uniformly, by the M-test,
on the closed set $\mathbf{R} \setminus (-c, c) = (-\infty, -c] \cup [c, \infty)$ for all
$c > 0$. But, although $f_k \to 0$ pointwise on $\mathbf{R} \setminus \{0\}$, we have
$\sup_{x \ne 0} |f_k(x)| \ge |f_k(1/k)| = 1 \not\to 0$. Hence, $(f_k)$ does not
converge uniformly to 0 on $\mathbf{R} \setminus \{0\}$, so the series $\sum f_k$ does
not converge uniformly on $\mathbf{R} \setminus \{0\}$. ♣


3.4 Improper Integrals 

3.4.1 Definitions 

Suppose that a given function $f(x)$ is integrable on every open subinterval
of $(a, b)$. We try to perform the integration $\int_a^b f(x)\,dx$ under the following
conditions:

1. $f(x)$ is unbounded in a neighborhood of $x = a$ or $x = b$.

2. The interval $(a, b)$ itself is unbounded.

In Case 1, we define a definite integral,

$$\int_a^b f(x)\,dx = \lim_{X \to b-0} \int_a^X f(x)\,dx,$$

if $f(x)$ is bounded and integrable on every finite interval $(a, X)$ for $a < X < b$.
Similarly, if $f(x)$ is bounded and integrable on every $(X, b)$ for $a < X < b$, we
can define

$$\int_a^b f(x)\,dx = \lim_{X \to a+0} \int_X^b f(x)\,dx.$$

These definite integrals are called improper integrals. Straightforward
extension of these results to Case 2 yields the other improper integrals:

$$\int_a^{\infty} f(x)\,dx = \lim_{X \to \infty} \int_a^X f(x)\,dx$$

and

$$\int_{-\infty}^{b} f(x)\,dx = \lim_{X \to \infty} \int_{-X}^{b} f(x)\,dx.$$

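Examples 1. The improper integral $\int_1^\infty dx/x^2$ has the value 1 since

$$\int_1^\infty \frac{dx}{x^2} = \lim_{X \to \infty} \int_1^X \frac{dx}{x^2} = \lim_{X \to \infty} \left( 1 - \frac{1}{X} \right) = 1.$$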


2. The improper integral / has the value 1 since 


/o 


ix 


dx 

—= = lim 

\fx £ — ^+0 


dx 2 — \fe 

—= = lim — = 1. 

\fx s — ^+o 2 
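The definition of an improper integral as a limit of proper integrals can be watched numerically. The sketch below is an illustration only: it assumes the test integrand $1/x^2$ on $[1, X]$, whose exact truncated value is $1 - 1/X$, and uses a plain midpoint rule; the helper names are hypothetical.

```python
def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the proper integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def truncated_integral(upper):
    """Proper integral of 1/x^2 over [1, upper]; exact value is 1 - 1/upper."""
    return midpoint_integral(lambda x: 1.0 / (x * x), 1.0, upper)


if __name__ == "__main__":
    for upper in (10.0, 100.0, 1000.0):
        # The truncated integrals approach the improper value 1 as upper grows.
        print(upper, truncated_integral(upper))
```

The improper integral is the limit of these truncated values; the quadrature itself is always applied to a bounded interval on which the integrand is bounded.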

3.4.2 Convergence of an Improper Integral

An improper integral of $f(x)$ is said to converge if and only if the
corresponding limit exists. Furthermore, it is said to converge absolutely if and
only if the corresponding improper integral of $|f(x)|$ converges. (Keep in
mind that absolute convergence implies convergence in the ordinary sense.)
A convergent improper integral that does not converge absolutely is
conditionally convergent.

An improper integral $\int f(x,y)\,dx$ converges uniformly on a set $S$ of
values of $y$ if and only if the corresponding limit converges uniformly on $S$. A
relevant theorem is given below.


♠ Continuity theorem

If $f(x,y)$ is a continuous function, then $\int f(x,y)\,dx$ is a continuous
function of $y$ in every open interval where the integral converges uniformly.


3.4.3 Principal Value Integral 

Suppose that a bounded or unbounded open or closed interval, $(a, b)$ or $[a, b]$,
contains a discrete set of points $x = c_1, c_2, \cdots$, such that $f(x)$ is unbounded in
a neighborhood of $x = c_i$ $(i = 1, 2, \cdots)$. Then, the integral $\int_a^b f(x)\,dx$ may be
defined as a sum of improper integrals of the kind introduced in Sect. 3.4.1;
i.e.,

$$\int_a^b f(x)\,dx = \lim_{X_1 \to a+0} \int_{X_1}^c f(x)\,dx + \lim_{X_2 \to b-0} \int_c^{X_2} f(x)\,dx \quad (a < c < b), \tag{3.32}$$

$$\int_a^b f(x)\,dx = \lim_{X_1 \to c-0} \int_a^{X_1} f(x)\,dx + \lim_{X_2 \to c+0} \int_{X_2}^b f(x)\,dx \quad (a < c < b), \tag{3.33}$$

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{X_1 \to \infty} \int_{-X_1}^c f(x)\,dx + \lim_{X_2 \to \infty} \int_c^{X_2} f(x)\,dx, \tag{3.34}$$

if the limits exist.

Even when the integrals (3.32), (3.33) and (3.34) do not exist, the limits
of integrals

$$\lim_{X \to \infty} \int_{-X}^{X} f(x)\,dx \quad \text{and} \quad \lim_{\delta \to 0} \left[ \int_a^{c-\delta} f(x)\,dx + \int_{c+\delta}^b f(x)\,dx \right]$$

may exist. These limits are the principal value integrals. Whenever one of the
integrals (3.32), (3.33) or (3.34) itself exists, it is necessarily equal to the
corresponding principal value integral (see Sect. 9.4.1).
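The difference between the two-sided limits in (3.33) and the symmetric limit of the principal value can be seen on the standard test case $f(x) = 1/x$ on $[-1, 1]$ with $c = 0$ (an assumed example, not taken from the text). The sketch below removes a gap around the singularity and integrates the rest with a midpoint rule; the function names are illustrative.

```python
import math


def midpoint_integral(f, a, b, n=100000):
    """Composite midpoint rule for the proper integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def reciprocal(x):
    return 1.0 / x


def excised_integral(left_gap, right_gap):
    """Integral of 1/x over [-1, -left_gap] plus [right_gap, 1]."""
    left = midpoint_integral(reciprocal, -1.0, -left_gap)
    right = midpoint_integral(reciprocal, right_gap, 1.0)
    return left + right


if __name__ == "__main__":
    delta = 1e-3
    # Symmetric excision (principal value): the two halves cancel.
    print(excised_integral(delta, delta))
    # Asymmetric excision: the result is close to -log(2), not 0.
    print(excised_integral(delta, 2 * delta), -math.log(2))
```

With equal gaps the result stays near 0 for every $\delta$, whereas gaps $\delta$ and $2\delta$ give $-\log 2$. The value therefore depends on how the two limits are coupled, so the integral (3.33) does not exist for this $f$ although its principal value does.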


3.4.4 Conditions for Convergence 


In what follows we give the convergence criteria for improper integrals of the
form

$$\int_a^\infty f(x)\,dx = \lim_{X \to \infty} \int_a^X f(x)\,dx$$

and

$$\int_a^b f(x)\,dx = \lim_{X \to b-0} \int_a^X f(x)\,dx.$$

We assume that $f(x)$ is bounded and integrable on every bounded interval
$(a, X)$ that does not contain the upper limit of integration.


♠ Cauchy's test (= necessary and sufficient conditions for convergence):

The improper integral $\int_a^\infty f(x)\,dx$ converges if and only if for every
positive real number $\varepsilon$, there exists a real number $M > a$ such that

$$X_2 > X_1 > M \;\Longrightarrow\; \left| \int_{X_1}^{X_2} f(x)\,dx \right| < \varepsilon.$$

Similarly, $\int_a^b f(x)\,dx$ converges if and only if for every positive $\varepsilon$, there
exists a positive $\delta < b - a$ such that

$$b - X_2 < b - X_1 < \delta \;\Longrightarrow\; \left| \int_{X_1}^{X_2} f(x)\,dx \right| < \varepsilon.$$

A sufficient condition for an improper integral to converge uniformly and
absolutely is stated below.


♠ Weierstrass test

The improper integral $\int_a^\infty f(x,y)\,dx$ [or $\int_a^b f(x,y)\,dx$] converges
uniformly and absolutely on every set $S$ of values of $y$ such that $|f(x, y)| \le g(x)$
on the interval of integration, where $g(x)$ is a real comparison function
whose integral $\int_a^\infty g(x)\,dx$ [or $\int_a^b g(x)\,dx$, respectively] converges.

Exercises

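1. Show that the integral $\int_\pi^\infty \frac{\sin x}{x}\,dx$ converges.

Solution: Integrating by parts, we have

$$\int_\pi^X \frac{\sin x}{x}\,dx = \left[ -\frac{\cos x}{x} \right]_\pi^X - \int_\pi^X \frac{\cos x}{x^2}\,dx,$$

so that

$$\left| \int_\pi^X \frac{\sin x}{x}\,dx \right| < \frac{1}{\pi} + \frac{1}{X} + \int_\pi^X \frac{dx}{x^2} = \frac{1}{\pi} + \frac{1}{X} + \left( \frac{1}{\pi} - \frac{1}{X} \right) = \frac{2}{\pi}.$$

As $X \to \infty$, the boundary term tends to $-1/\pi$ and the remaining integral
converges absolutely by comparison with $\int_\pi^\infty dx/x^2$, so the limit exists.
This completes the proof. ♣


2. Show that $\int_0^\infty \left| \frac{\sin x}{x} \right| dx$ diverges.

Solution: It follows that

$$\int_{n\pi}^{(n+1)\pi} \left| \frac{\sin x}{x} \right| dx = \int_0^\pi \frac{\sin x}{n\pi + x}\,dx > \frac{1}{(n+1)\pi} \int_0^\pi \sin x\,dx = \frac{2}{(n+1)\pi} > \frac{2}{\pi} \int_{n+1}^{n+2} \frac{dx}{x}.$$

Hence, for $n > 1$ we have

$$\int_0^{n\pi} \left| \frac{\sin x}{x} \right| dx > \frac{2}{\pi} \int_1^{n+1} \frac{dx}{x} = \frac{2}{\pi} \log(n+1) \to \infty \quad (n \to \infty). \; ♣$$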
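The logarithmic lower bound in the divergence argument for $\int |\sin x|/x\,dx$ can be compared with numerically computed partial integrals. This is a rough sketch assuming the integral is taken over $[0, n\pi]$ and approximated with a midpoint rule (which never evaluates the integrand at $x = 0$); the names are illustrative.

```python
import math


def abs_sinc_integral(n, steps_per_period=2000):
    """Midpoint approximation of the integral of |sin x| / x over [0, n*pi]."""
    total_steps = n * steps_per_period
    h = n * math.pi / total_steps
    return h * sum(
        abs(math.sin((i + 0.5) * h)) / ((i + 0.5) * h) for i in range(total_steps)
    )


def log_lower_bound(n):
    """The bound (2/pi) * log(n + 1) obtained by summing over the periods."""
    return 2.0 / math.pi * math.log(n + 1)


if __name__ == "__main__":
    for n in (1, 5, 20, 80):
        # The partial integral stays above a bound that grows without limit.
        print(n, abs_sinc_integral(n), log_lower_bound(n))
```

Both columns keep growing with $n$, in line with the claim that the integral of $\sin x / x$ converges only conditionally.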


3. Suppose that $f(x)$ is continuous within an interval $(a, b]$ and diverges at
$x = a$. Prove that $\int_a^b f(x)\,dx$ converges if $(x - a)^p |f(x)|$ is bounded on the
interval for $0 < p < 1$.

Solution: We assume that there is an appropriate positive number $M$ such that

$$(x - a)^p |f(x)| < M \quad \text{for all } x \in (a, b].$$

Then we obtain

$$\int_{a+\varepsilon}^b |f(x)|\,dx < M \int_{a+\varepsilon}^b \frac{dx}{(x-a)^p} = M \left[ \frac{(x-a)^{1-p}}{1-p} \right]_{a+\varepsilon}^b = \frac{M}{1-p} \left[ (b-a)^{1-p} - \varepsilon^{1-p} \right] < \frac{M}{1-p} (b-a)^{1-p} \quad (\text{since } 1 - p > 0). \tag{3.35}$$

Note that the integral on the left-hand side of (3.35) is monotonically
increasing with decreasing $\varepsilon$, since $|f(x)| \ge 0$ over the
integration interval. Yet it is bounded from above, as proved in
(3.35). Hence, we conclude that the given integral is convergent
(absolutely). ♣


4. Suppose that $f(x)$ is continuous within $[a, \infty)$ and that $x^p |f(x)|$ is
bounded there for $p > 1$. Show that the integral $\int_a^\infty f(x)\,dx$ converges.

Solution: It follows from the hypothesis that there is a positive
number $M$ such that

$$x^p |f(x)| < M \quad \text{for all } x \ge a.$$

Hence, assuming $a > 0$, we have for any $X > a$,

$$\int_a^X |f(x)|\,dx < M \int_a^X \frac{dx}{x^p} = \frac{-M}{p-1} \left[ \frac{1}{x^{p-1}} \right]_a^X < \frac{M}{p-1} \frac{1}{a^{p-1}},$$

which completes the proof. ♣


Hilbert Spaces


Abstract A Hilbert space is an abstract vector space with the following two
properties: the inner product property (Sect. 4.1.3), which determines the geometry of
the vector space, and the completeness property (Sect. 4.1.6), which guarantees the
self-consistency of the space. Most of the mathematical topics covered in this volume
are based on Hilbert spaces. In particular, the $L^2$ and $\ell^2$ spaces (Sect. 4.3), which
are the members of the $L^p$ and $\ell^p$ families that form Hilbert spaces, are crucial for
the formulation of the theories of orthonormal polynomials, Lebesgue integrals,
Fourier analyses, and others, as we discuss in subsequent chapters.


4.1 Hilbert Spaces 

4.1.1 Introduction 

This section provides a framework for an understanding of Hilbert spaces. 
Plainly speaking, Hilbert spaces are the generalization of familiar finite- 
dimensional spaces to the infinite-dimensional case. In fact, the geometric 
structure of Hilbert spaces is very similar to that of ordinary Euclidean geom- 
etry. This analogy comes from the fact that the concept of orthogonality can 
be introduced in any Hilbert space so that the familiar Pythagorean theorem 
holds for elements involved in the space. Moreover, owing to its generality, 
a large number of problems in physics and engineering can be successfully 
treated with a geometric point of view in Hilbert spaces. 

As we shall see later, Hilbert spaces are defined as a specific class of vector
spaces endowed with the following two properties: inner product and
completeness. The former property leads to a rich geometric structure and the
latter enables us to describe an element in the space in terms of an
orthonormal basis. These facts result in the possibility of establishing a wide
variety of complete orthonormal sets of functions in Hilbert spaces;
we discuss this point in detail in Sects. 5.1 and 5.2. For a better
understanding of subsequent discussions, we provide all necessary definitions in
this section, and then describe several important consequences relevant to an
understanding of the nature of Hilbert spaces.

4.1.2 Abstract Vector Spaces 

In order to make this text self-contained, we first give a brief summary of the 
definition of vector spaces. A more precise description of vector spaces and 
some related matters will be provided in Sect. 4.2.1. 

♠ Vector spaces:

A vector space $V$ is a collection of elements called vectors, which we
denote by $x, y, \cdots$, that satisfy the following postulates:

1. There exists an operation $(+)$ on the vectors $x$ and $y$ such that $x + y =
y + x$, where the resultant quantity $y + x$ also must be a vector.

2. There exists an identity vector (denoted by $0$) that yields $x + 0 = x$.

3. For every $x \in V$, there exists a vector $\alpha x \in V$ in which $\alpha$ is an arbitrary
scalar (real or complex). In addition,

$$\alpha(\beta x) = (\alpha\beta) x, \quad 1x = x \quad \text{for all } x,$$

$$\alpha(x + y) = \alpha x + \alpha y, \quad (\alpha + \beta) x = \alpha x + \beta x.$$

Emphasis is placed on the fact that vector spaces are not limited to a set 
of geometric arrows embedded in a Euclidean space (see Sects. 4.1.3 and 
19.2.3); rather, they are general mathematical systems that have a specific 
algebraic structure. Several examples of such abstract vector spaces are given 
below. 

Examples 1. The set of all $n$-tuples of complex numbers denoted by

$$x = (\xi_1, \xi_2, \cdots, \xi_n)$$

forms a vector space if the addition of vectors and the multiplication of a
vector by a scalar are defined by

$$x + y = (\xi_1, \xi_2, \cdots, \xi_n) + (\eta_1, \eta_2, \cdots, \eta_n) = (\xi_1 + \eta_1, \xi_2 + \eta_2, \cdots, \xi_n + \eta_n),$$

$$\alpha x = \alpha(\xi_1, \xi_2, \cdots, \xi_n) = (\alpha\xi_1, \alpha\xi_2, \cdots, \alpha\xi_n).$$

2. The set of all complex numbers $\{z\}$ forms a complex vector space (see
Sect. 4.2.1), where $z_1 + z_2$ and $\alpha z$ are interpreted as ordinary complex
numerical addition and multiplication.

3. The set of all polynomials in a real variable $x$, built from the monomials
$\{1, x, x^2, x^3, \cdots\}$, with complex coefficients is a complex vector space if
vector addition and scalar multiplication are the ordinary addition of two
polynomials and the multiplication of a polynomial by a complex number,
respectively.

4.1.3 Inner Product

The structure of a vector space is enormously enriched by introducing the 
concept of inner product, which enables us to define the length of a vector 
in a given vector space or the angle between the two vectors involved. 

♠ Inner product:

An inner product is a scalar-valued function of the ordered pair of vectors
$x$ and $y$ such that

1. $(x, y) = (y, x)^*$.

2. $(\alpha x + \beta y, z) = \alpha^* (x, z) + \beta^* (y, z)$, where $\alpha$ and $\beta$ are arbitrary complex
numbers.

3. $(x, x) \ge 0$ for any $x$; $(x, x) = 0$ if and only if $x = 0$.

Here, the asterisk $(*)$ indicates that one is to take the complex conjugate.


Remark. Vector spaces endowed with an inner product are called inner prod- 
uct spaces. In particular, a real inner product space is called a Euclidean 
space and a complex inner product space is called a unitary space. 

The algebraic properties 1 and 2 are in principle the same as those governing 
the scalar product in ordinary vector algebra in a real vector space. The 
only property that is not obvious is that in a complex space, the inner product 
is not linear, but rather conjugate linear with respect to the first factor, i.e., 

$$(\alpha x, y) = \alpha^* (x, y).$$

Examples 1. The simplest, but an important, example of an inner product
space is the space, denoted by $\mathbb{C}^n$, that consists of a set of complex numbers
$\{z_1, z_2, \cdots, z_n\}$. For two vectors $x = (\xi_1, \xi_2, \cdots, \xi_n)$ and $y = (\eta_1, \eta_2, \cdots, \eta_n)$
in $\mathbb{C}^n$, the inner product is defined by

$$(x, y) = \sum_{i=1}^n \xi_i^* \eta_i.$$

2. Suppose that $f(x)$ and $g(x)$ are polynomials in the complex vector space
defined on the closed interval $x \in [0, 1]$. They then constitute an inner
product space under the inner product defined by

$$(f, g) = \int_0^1 f(x)^* g(x) w(x)\,dx,$$

where $w(x)$ is a weight function. The weight function becomes important
when defining the inner product of polynomials, which is treated in
Chap. 5.

3. If $x = [\xi_1, \xi_2, \xi_3, \xi_4]$ and $y = [\eta_1, \eta_2, \eta_3, \eta_4]$ are column four-vectors
having real-valued elements, then the quantity

$$(x, y) = \xi_1\eta_1 + \xi_2\eta_2 + \xi_3\eta_3 - \xi_4\eta_4 \tag{4.1}$$

satisfies requirements 1 and 2 for an inner product, but not 3 since the
quantity $(x, x)$ is not positive-definite. Thus the entity (4.1) is not an inner
product, but it plays an important role in the theory of special relativity.

For a complex vector space, the inner product is not symmetrical as it is in
a real vector space. That is, $(x, y) \ne (y, x)$ but rather $(x, y) = (y, x)^*$. This
implies that $(x, x)$ is real for every $x$, so we can define the length of the vector
$x$ by

$$\|x\| = (x, x)^{1/2}.$$

Since $(x, x) \ge 0$, $\|x\|$ is always nonnegative and real. The quantity $\|x\|$ is
referred to as the norm of the vector $x$. Note also that

$$\|\alpha x\| = (\alpha x, \alpha x)^{1/2} = [\alpha^* \alpha (x, x)]^{1/2} = |\alpha| \cdot \|x\|.$$

Remark. Precisely speaking, the quantity $\|x\|$ introduced above is a special
kind of norm that is associated with an inner product; in fact, the norm was
originally a more general concept that was independent of the inner product
(see Sect. 4.2.2).
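The conventions of this subsection (conjugate symmetry, conjugate linearity in the first factor, and the induced norm) can be tried out on small vectors. The sketch below assumes the standard inner product $\sum_i \xi_i^* \eta_i$ on $\mathbb{C}^n$ and the indefinite form (4.1); the function names are illustrative only.

```python
import math


def inner(x, y):
    """(x, y) = sum of conj(x_i) * y_i: conjugate-linear in the first factor."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product, ||x|| = (x, x)^(1/2)."""
    return math.sqrt(inner(x, x).real)


def minkowski_form(x, y):
    """The indefinite form (4.1) for real four-vectors."""
    return x[0] * y[0] + x[1] * y[1] + x[2] * y[2] - x[3] * y[3]


if __name__ == "__main__":
    x, y = [1 + 2j, 3j], [2, 1 - 1j]
    print(inner(x, y), inner(y, x).conjugate())  # equal: property 1
    print(norm([3, 4]))                          # Euclidean length of (3, 4)
    print(minkowski_form([0, 0, 0, 1], [0, 0, 0, 1]))  # negative: not an inner product
```

The last line shows why (4.1) fails requirement 3: a nonzero vector can have a negative value of $(x, x)$.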


4.1.4 Geometry of Inner Product Spaces 

Once a vector space is endowed with an inner product, several important the- 
orems that can be easily interpreted in analogy with Euclidean geometry can 
be applied. The following three theorems characterize the geometric nature of 
inner product spaces ($x \ne 0$ and $y \ne 0$ are assumed; otherwise the theorems
all become trivial).

♠ Schwarz inequality:

For any two elements $x$ and $y$ of an inner product space, we have

$$|(x, y)| \le \|x\| \, \|y\|. \tag{4.2}$$

The equality holds if and only if $x$ and $y$ are linearly dependent.


Proof From the definition of the inner product, we have

$$0 \le (x + \alpha y, x + \alpha y) = (x, x) + \alpha (x, y) + \alpha^* (y, x) + |\alpha|^2 (y, y). \tag{4.3}$$

Now, set $\alpha = -(y, x)/(y, y)$ and multiply by $(y, y)$ to obtain

$$0 \le (x, x)(y, y) - |(x, y)|^2,$$

which gives Schwarz's inequality.

Next, we prove the statement of the equality in (4.2). If $x$ and $y$ are linearly
dependent, then $y = \alpha x$ for some complex number $\alpha$ so that we have

$$|(x, y)| = |(x, \alpha x)| = |\alpha| (x, x) = |\alpha| \, \|x\| \, \|x\| = \|x\| \, \|\alpha x\| = \|x\| \, \|y\|.$$

The converse is also true; let $x$ and $y$ be vectors such that $|(x, y)| = \|x\| \, \|y\|$,
or equivalently,

$$|(x, y)|^2 = (x, y)(y, x) = (x, x)(y, y) = \|x\|^2 \|y\|^2. \tag{4.4}$$

Then we set

$$\| (y, y) x - (y, x) y \|^2 = \|y\|^4 \|x\|^2 + |(y, x)|^2 \|y\|^2 - \|y\|^2 (y, x)(x, y) - \|y\|^2 (y, x)^* (y, x) = 0, \tag{4.5}$$

where the postulate (4.4) and the relation $(y, x)^* = (x, y)$ were used. The
result (4.5) means that

$$(y, y) x - (y, x) y = 0,$$

which clearly shows that $x$ and $y$ are linearly dependent, which completes the
proof. ♣


♠ Triangle inequality:

For any two elements $x$ and $y$ of an inner product space, we have

$$\|x + y\| \le \|x\| + \|y\|.$$


Proof Setting $\alpha = 1$ in (4.3), we have

$$\|x + y\|^2 = (x, x) + (y, y) + 2\,\mathrm{Re}(x, y) \le (x, x) + (y, y) + 2|(x, y)| \le \|x\|^2 + \|y\|^2 + 2\|x\| \, \|y\| \;\; (\text{by Schwarz's inequality}) = (\|x\| + \|y\|)^2,$$

which proves the desired inequality. ♣

♠ Parallelogram law:

For any two elements $x$ and $y$ of an inner product space, we have

$$\|x + y\|^2 + \|x - y\|^2 = 2 \left( \|x\|^2 + \|y\|^2 \right).$$


Proof We have

$$\|x + y\|^2 = (x, x) + (x, y) + (y, x) + (y, y) = \|x\|^2 + (x, y) + (y, x) + \|y\|^2. \tag{4.6}$$

Now replace $y$ by $-y$ to obtain

$$\|x - y\|^2 = \|x\|^2 - (x, y) - (y, x) + \|y\|^2. \tag{4.7}$$

By adding (4.6) and (4.7), we attain our objective. ♣
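The three statements of this subsection lend themselves to a quick numerical sanity check on random complex vectors. The following is a sketch only, assuming the standard inner product on $\mathbb{C}^n$; the helper names and the tolerance are illustrative choices.

```python
import math
import random


def inner(x, y):
    """(x, y) = sum of conj(x_i) * y_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def check_geometry(x, y, tol=1e-9):
    """Return (Schwarz holds, triangle holds, parallelogram law holds)."""
    total = [a + b for a, b in zip(x, y)]
    diff = [a - b for a, b in zip(x, y)]
    schwarz = abs(inner(x, y)) <= norm(x) * norm(y) + tol
    triangle = norm(total) <= norm(x) + norm(y) + tol
    defect = norm(total) ** 2 + norm(diff) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)
    return schwarz, triangle, abs(defect) <= tol


def random_complex_vector(n, rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    x, y = random_complex_vector(4, rng), random_complex_vector(4, rng)
    print(check_geometry(x, y))
```

Taking $y = \alpha x$ turns the Schwarz inequality into an equality, in agreement with the linear-dependence condition.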


4.1.5 Orthogonality 

One of the most important consequences of having the inner product is being 
able to define the orthogonality of vectors. The orthogonality allows us to 
establish a set of orthonormal bases that span the inner product space in 
question, thus yielding a useful way to analyze both the nature of the space 
itself and the relation between the constituents involved in that space. 

♠ Orthogonality:

Two vectors x and y in an inner product space are called orthogonal 
if and only if (x, y) = 0. 

Notably, if (x,y) = 0, then (x,y) = (y,x)* = 0 so that (y,x) = 0 as well. 
Thus, the orthogonality is a symmetric relation, although the inner product 
is not symmetric. Note also that the zero vector 0 is orthogonal to every 
vector in the inner product space. 

A set of $n$ vectors $\{x_1, x_2, \cdots, x_n\}$ is called orthonormal if $(x_i, x_j) = \delta_{ij}$
for all $i$ and $j$, where $\delta_{ij}$ is the Kronecker delta. That is, the orthonormality
of a set of vectors means that each vector is orthogonal to all the others in
the set and is normalized to unit length.

It follows that any nonzero vector $x$ may be normalized by dividing by its length to
form the new vector $x/\|x\|$ with unit length. An example of an orthonormal
set of vectors is the set of three unit vectors, $\{e_i\}$ $(i = 1, 2, 3)$, for the
three-dimensional Cartesian space.

The following theorem is important in various fields of mathematical
physics.

♠ Theorem:

An orthonormal set is linearly independent.

(Proof of the theorem is given in Exercise 1.) Importantly, the above theorem
suggests that any orthonormal set serves as a basis for the subspace that it
spans, and for the whole inner product space of interest when the set is
complete (see Sect. 4.2.5). Below is another consequence of the orthonormal
set of vectors; its proof is given in Exercise 2.

♠ Bessel inequality:

If $\{x_1, x_2, \cdots, x_n\}$ is a set of orthonormal vectors and $x$ is any vector
defined in the same inner product space, then

$$\|x\|^2 \ge \sum_i |r_i|^2, \tag{4.8}$$

where $r_i = (x_i, x)$. Furthermore, the vector $x' = x - \sum_i r_i x_i$ is orthogonal
to each $x_j$.
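A small worked instance makes the Bessel inequality concrete. The sketch below assumes an orthonormal pair in $\mathbb{C}^3$ chosen purely for illustration (the vectors `E1`, `E2` and the test vector are not from the text) and computes the coefficients $r_i = (x_i, x)$, the captured sum $\sum_i |r_i|^2$, and the residual $x' = x - \sum_i r_i x_i$.

```python
import math


def inner(x, y):
    """(x, y) = sum of conj(x_i) * y_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def bessel_check(orthonormal_set, x):
    """Return (sum of |r_i|^2, ||x||^2, residual x' = x - sum r_i x_i)."""
    coeffs = [inner(e, x) for e in orthonormal_set]
    residual = list(x)
    for r, e in zip(coeffs, orthonormal_set):
        residual = [v - r * c for v, c in zip(residual, e)]
    captured = sum(abs(r) ** 2 for r in coeffs)
    return captured, inner(x, x).real, residual


# Hypothetical orthonormal pair in C^3, used only as an example.
S = 1 / math.sqrt(2)
E1 = [1, 0, 0]
E2 = [0, S, 1j * S]

if __name__ == "__main__":
    captured, total, residual = bessel_check([E1, E2], [1 + 1j, 2, -1j])
    print(captured, total)  # captured part does not exceed ||x||^2
    print([inner(e, residual) for e in (E1, E2)])  # both (numerically) zero
```

Because the pair does not span $\mathbb{C}^3$, the inequality is strict here; the missing part of $\|x\|^2$ is carried by the residual, which is orthogonal to both vectors.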


4.1.6 Completeness of Vector Spaces 

Having described features of inner product spaces, we turn now to
another important concept relevant to the nature of Hilbert spaces, i.e.,
completeness. When a vector space is finite dimensional, the completeness
of an orthonormal set involved in the space may be characterized by the
fact that it is not contained in any larger orthonormal set. (This is
intuitively understood by considering the Cartesian basis $e_i$ $(i = 1, 2, 3)$ in a
three-dimensional Euclidean space.) When considering an infinite-dimensional
space, however, the completeness must be determined via the Cauchy
criterion, which we discussed in Sect. 2.2. The following is a preliminary
definition.


♠ Cauchy sequence of vectors:

A sequence $\{x_1, x_2, \cdots\}$ of vectors is called a Cauchy sequence of
vectors if for any positive $\varepsilon$, there exists an appropriate number $N$
such that $\|x_m - x_n\| < \varepsilon$ for all $m, n > N$.


In plain words, a sequence is a Cauchy sequence if the terms $x_m$ and $x_n$ in
the sequence come closer and closer to each other as $m, n \to \infty$.

♠ Convergence of a sequence of vectors:

A sequence $\{x_1, x_2, \cdots\}$ is said to be convergent if there exists an
element $x$ such that $\|x_n - x\| \to 0$.

♠ Completeness of a vector space:

If every Cauchy sequence in a space is convergent, we say that the space
is complete.


Remark. Here the norm $\|x\| = (x, x)^{1/2}$ associated with an inner product is
employed to define a Cauchy sequence, since we are focusing on inner product
spaces. However, the concepts of Cauchy sequence and completeness both
apply to more general vector spaces in which even a norm is unnecessary (see
Sect. 4.1.6 for details).


4.1.7 Several Examples of Hilbert Spaces 

Now we are ready to define Hilbert spaces. 

♠ Hilbert space:

If an inner product space is complete, it is called a Hilbert space.


Examples 1. Column-vector spaces with $n$ real and complex components,
denoted by $\mathbb{R}^n$ and $\mathbb{C}^n$, respectively, are finite-dimensional Hilbert spaces
if endowed with an inner product $(x, y) = \sum_{i=1}^n x_i^* y_i$. Completeness can
be proved using the Bolzano-Weierstrass theorem (see Appendix A).

2. Assume an infinite-dimensional vector $x = (x_1, x_2, \cdots)$, where $x_i$ is a real
or complex number satisfying the condition

$$\sum_{i=1}^\infty |x_i|^2 < \infty.$$

Then, vector spaces spanned by a set of vectors $\{x\}$, called $\ell^2$ spaces (see
Sect. 4.3), are Hilbert spaces under the inner product

$$(x, y) = \sum_{i=1}^\infty x_i^* y_i.$$

Completeness will be proved in Sect. 4.3.1.

4. Assume a set of square-integrable functions $f(x)$ expressed by

$$\int_a^b |f(x)|^2\,dx < \infty.$$

Then, the collection of all square-integrable functions, called the $L^2$ space,
is a Hilbert space endowed with the inner product

$$(f, g) = \int_a^b f(x)^* g(x)\,dx. \tag{4.9}$$

Completeness will be proved in Sect. 4.3.2.

5. Finally we show an example of an incomplete inner product space. Assume
the following sequence of real-valued continuous functions $\{f_1(x), f_2(x), \cdots\}$,
each of which is defined within the interval $[0, 1]$:

$$f_n(x) = \begin{cases} 1, & \text{for } 0 \le x \le \tfrac{1}{2}, \\ 1 - 2n\left(x - \tfrac{1}{2}\right) & \text{for } \tfrac{1}{2} \le x \le \tfrac{1}{2} + \tfrac{1}{2n}, \\ 0, & \text{for } \tfrac{1}{2} + \tfrac{1}{2n} \le x \le 1. \end{cases} \tag{4.10}$$

The graphs of $f_n(x)$ for $n = 1, 2, 3$ are given in Fig. 4.1. After some
algebra, we obtain

$$\| f_n(x) - f_m(x) \| = \left[ \int_0^1 (f_n - f_m)^2\,dx \right]^{1/2} = \frac{1 - n/m}{\sqrt{6n}} \to 0 \quad \text{as } m, n \to \infty \quad (m > n).$$

Fig. 4.1. The function $f_n(x)$ given in (4.10). The sequence $\{f_n(x)\}$ converges to a
step function in the limit $n \to \infty$

Thus, $\{f_n\}$ is a Cauchy sequence with respect to the norm induced by the
inner product (4.9). However, this sequence converges to the limit function

$$f(x) = \begin{cases} 1 & \text{if } 0 \le x \le \tfrac{1}{2}, \\ 0 & \text{if } \tfrac{1}{2} < x \le 1, \end{cases}$$

which is not continuous and, hence, is not an element of the original inner
product space. Consequently, the inner product space of continuous functions
on $[0, 1]$ is not complete, and thus is not a Hilbert space.
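The Cauchy property of the ramp functions in (4.10) can be checked numerically. The sketch below assumes the ramp occupies the interval $[1/2, 1/2 + 1/(2n)]$, approximates the $L^2$ distance with a midpoint rule, and compares it with the closed form $(1 - n/m)/\sqrt{6n}$ for $m > n$, which follows from a direct integration of $(f_n - f_m)^2$; the function names are illustrative.

```python
import math


def f(n, x):
    """The continuous ramp function f_n of (4.10) on [0, 1]."""
    if x <= 0.5:
        return 1.0
    if x <= 0.5 + 1.0 / (2 * n):
        return 1.0 - 2 * n * (x - 0.5)
    return 0.0


def l2_distance(n, m, steps=20000):
    """Midpoint approximation of ||f_n - f_m|| in the L^2 norm on [0, 1]."""
    h = 1.0 / steps
    total = sum((f(n, (i + 0.5) * h) - f(m, (i + 0.5) * h)) ** 2 for i in range(steps))
    return math.sqrt(h * total)


def closed_form(n, m):
    """(1 - n/m) / sqrt(6 n), valid for m > n."""
    return (1.0 - n / m) / math.sqrt(6 * n)


if __name__ == "__main__":
    for n, m in ((2, 5), (10, 20), (100, 200)):
        # The distance shrinks as n and m grow, as a Cauchy sequence requires.
        print(n, m, l2_distance(n, m), closed_form(n, m))
```

The distances shrink like $1/\sqrt{n}$, yet the pointwise limit is a step function, which lies outside the space of continuous functions.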


Exercises 

1. Show that an orthonormal set is linearly independent. 

Solution: Recall that a set of vectors $\{x_1, x_2, \cdots, x_n\}$ is said to
be linearly independent if and only if

$$\sum_{i=1}^n \alpha_i x_i = 0 \;\Longrightarrow\; \alpha_i = 0 \text{ for all } i.$$

Now suppose that a set $\{x_1, x_2, \cdots, x_n\}$ is orthonormal and
satisfies the relation $\sum_i \alpha_i x_i = 0$. Then, for any $j$, the orthonormal
condition $(x_i, x_j) = \delta_{ij}$ results in

$$0 = \left( x_j, \sum_{i=1}^n \alpha_i x_i \right) = \sum_{i=1}^n \alpha_i (x_j, x_i) = \sum_{i=1}^n \alpha_i \delta_{ij} = \alpha_j.$$

Therefore, the set is linearly independent. ♣


2. Show the Bessel inequality for $x$ given by (4.8) and the orthogonality of the
vector $x' = x - \sum_i r_i x_i$ to each $x_j$.

Solution: We consider the inequality

$$0 \le \|x'\|^2 = (x', x') = \left( x - \sum_{i=1}^n r_i x_i, \; x - \sum_{j=1}^n r_j x_j \right)$$

$$= (x, x) - \sum_{i=1}^n r_i^* (x_i, x) - \sum_{j=1}^n r_j (x, x_j) + \sum_{i,j=1}^n r_i^* r_j (x_i, x_j)$$

$$= \|x\|^2 - \sum_{i=1}^n |r_i|^2 - \sum_{j=1}^n |r_j|^2 + \sum_{i=1}^n |r_i|^2 = \|x\|^2 - \sum_{i=1}^n |r_i|^2.$$

Thus we have $\|x\|^2 \ge \sum_i |r_i|^2$. The second part of the theorem is
proven by

$$(x', x_j) = (x, x_j) - \sum_i r_i^* (x_i, x_j) = r_j^* - r_j^* = 0. \; ♣$$

4.2 Hierarchical Structure of Vector Spaces 

4.2.1 Precise Definitions of Vector Spaces 

In this section, we look at the hierarchical structure of abstract vector spaces. 
We will find that the Hilbert spaces that we have considered form a very 
limited, special class of general vector spaces under strict conditions. We begin 
with an exact definition of vector spaces. 

♠ Vector spaces:

A vector space $V$ is a set of elements $x$ (called vectors) that satisfy the
following sets of axioms:

1. $V$ is a commutative group under addition:

(i) $x + y = y + x \in V$ for any $x, y \in V$ (closedness).

(ii) $x + (y + z) = (x + y) + z$ (associativity).

(iii) There exists an additive identity, the zero vector $0$, such that
$x + 0 = x$ for every $x \in V$.

(iv) There exists an additive inverse $-x$ for every $x \in V$ such that
$x + (-x) = 0$.

2. $V$ satisfies the following additional axioms with respect to a number
field $F$, whose elements $\alpha$ are called scalars:

(i) $V$ is closed under scalar multiplication:

$\alpha x \in V$ for arbitrary $x \in V$ and $\alpha \in F$.

(ii) Scalar multiplication is distributive with respect to elements of both
$V$ and $F$:

$$\alpha(x + y) = \alpha x + \alpha y, \quad (\alpha + \beta) x = \alpha x + \beta x.$$

(iii) Scalar multiplication is associative: $\alpha(\beta x) = (\alpha\beta) x$.

(iv) Multiplication with the zero scalar $0 \in F$ gives the zero vector such
that $0x = 0 \in V$.

(v) The unit scalar $1 \in F$ has the property that $1x = x$.

In these definitions, $F$ is either the set of real numbers, $\mathbb{R}$, or the set of
complex numbers, $\mathbb{C}$. A vector space over $\mathbb{R}$ is called a real vector space.
If $F = \mathbb{C}$, then $V$ is a complex vector space.

4.2.2 Metric Space

Once a vector space is endowed with the concept of a distance between the
elements, say, $x \in V$ and $y \in V$, it is called a metric space.

♠ Metric space:

Assume a vector space $V$. A metric space is the pair $(V, \rho)$ in which the
function $\rho : V \times V \to \mathbb{R}$, called the distance function, is a single-valued,
nonnegative, real function that satisfies:

1. $\rho(x, y) = 0$ if and only if $x = y$.

2. $\rho(x, y) = \rho(y, x)$.

3. $\rho(x, y) \le \rho(x, z) + \rho(z, y)$ for any $z \in V$.


Remark. Strictly speaking, the above is called a metric vector space, a special
case of more general metric spaces. The latter consist of a pair $(U, \rho)$, where $U$ is
a set of points (not necessarily vectors) and $\rho$ is a distance function. If $U$ is a
vector space $V$, then $(V, \rho)$ is called a metric vector space.


Examples 1. If we set

$$\rho(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y \end{cases}$$

for arbitrary $x, y \in V$, we obtain a metric space.


2. The set of real numbers $\mathbb{R}$ with the distance function $\rho(x, y) = |x - y|$
forms a metric space.


3. The set of ordered $n$-tuples of real numbers $x = (x_1, x_2, \cdots, x_n)$ with the
distance function

$$\rho(x, y) = \left[ \sum_{i=1}^n (x_i - y_i)^2 \right]^{1/2}$$

is a metric space. This is in fact the Euclidean $n$-space, denoted by $\mathbb{R}^n$.


4. Consider again the set of ordered $n$-tuples of real numbers $x = (x_1,
x_2, \cdots, x_n)$ with an alternative distance function:

$$\rho(x, y) = \max \left[ |x_i - y_i| \, ; \, 1 \le i \le n \right].$$

This also serves as a metric space. The validity of Axioms 1 to 3 mentioned
above is obvious.

Comparison between Examples 3 and 4 tells us that the same vector space
$V$ can be metrized in different ways. These two examples call attention
to the importance of distinguishing a metric space $(V, \rho)$ from the vector
space $V$.
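The point that one set of $n$-tuples carries several metrics can be made concrete in a few lines. The sketch below implements the discrete, Euclidean, and maximum distance functions of Examples 1, 3, and 4 and tests the three axioms by brute force on a small, arbitrarily chosen set of points; the function names and sample points are assumptions for illustration.

```python
import math
from itertools import product


def discrete_metric(x, y):
    """Example 1: distance 0 for equal points, 1 otherwise."""
    return 0.0 if x == y else 1.0


def euclidean_metric(x, y):
    """Example 3: the Euclidean distance on n-tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def max_metric(x, y):
    """Example 4: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def satisfies_metric_axioms(rho, points, tol=1e-12):
    """Brute-force check of Axioms 1-3 on a finite list of distinct points."""
    for x, y in product(points, repeat=2):
        if (rho(x, y) <= tol) != (x == y):
            return False
        if abs(rho(x, y) - rho(y, x)) > tol:
            return False
    return all(
        rho(x, y) <= rho(x, z) + rho(z, y) + tol
        for x, y, z in product(points, repeat=3)
    )


if __name__ == "__main__":
    pts = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.5), (2.0, -2.0)]
    for rho in (discrete_metric, euclidean_metric, max_metric):
        print(rho.__name__, satisfies_metric_axioms(rho, pts), rho(pts[0], pts[1]))
```

The same pair of points is assigned three different distances, which is why the pair $(V, \rho)$, and not $V$ alone, defines the metric space.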
4.2.3 Normed Spaces

A metric space is said to be normed if for each element $x \in V$ there is a
corresponding nonnegative number $\|x\|$, which is called the norm of $x$.


♠ Normed space:

A metric space equipped with a norm is called a normed space. The
norm is defined as a real-valued function (denoted by $\| \cdot \|$) on a vector space
$V$, which satisfies

1. $\|\lambda x\| = |\lambda| \, \|x\|$ for all $\lambda \in F$ and $x \in V$.

2. $\|x + y\| \le \|x\| + \|y\|$.

3. $\|x\| = 0$ if and only if $x = 0$.


Obviously, a normed space is a metric space under the definition of the
distance $\rho(x, y) = \|x - y\|$.


Examples 1. The space consisting of all $n$-tuples of real numbers $x =
(x_1, x_2, \cdots, x_n)$ in which the norm is defined by

$$\|x\| = \left( \sum_{i=1}^n |x_i|^2 \right)^{1/2}$$

is a normed space.

2. The space above can be normed by a more general form:

$$\|x\| = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p} \quad (p \ge 1).$$

This norm is referred to as a $p$-norm of the vector $x$.

3. We further obtain an alternative normed space if we set the norm of the
vector $x = (x_1, x_2, \cdots, x_n)$ equal to $\max \{ |x_k| \, ; \, 1 \le k \le n \}$.

4. The collection of all continuous functions defined on the closed interval
$[a, b]$ in which

$$\|f(x)\| = \max \{ |f(x)| : x \in [a, b] \}$$

is a normed space.

5. The space consisting of all sequences $x = (x_1, x_2, \cdots, x_n, \cdots)$ of real numbers
that satisfy the condition $\lim_{n \to \infty} x_n = 0$ is a normed space if we set

$$\|x\| = \max \{ |x_k| : 1 \le k < \infty \}.$$

4.2.4 Subspaces of a Normed Space

A class of normed spaces involves the following two subclasses: one endowed 
with completeness and the other with the inner product. The normed spaces 
of the former class, i.e. , a class of complete normed vector spaces, are called 

Banach spaces. 

♠ Banach space:

If a normed space is complete, it is called a Banach space.


Here, the completeness of a space implies that every Cauchy sequence in the
space is convergent. Refer to the arguments in Sect. 4.1.6 for details.

Remark. Every finite-dimensional normed space is a Banach space, since it is
necessarily complete.


Examples 1. Suppose that a set of infinite-dimensional vectors $x = (x_1, x_2, \cdots,
x_n, \cdots)$ satisfies the condition

$$\sum_{i=1}^\infty |x_i|^p < \infty \quad (p \ge 1).$$

Then, this set is a Banach space, called an $\ell^p$ space, under the $p$-norm
defined by

$$\|x\|_p = \left( \sum_{i=1}^\infty |x_i|^p \right)^{1/p}. \tag{4.11}$$

The proof of its completeness is given in Sect. 4.3.1.

2. Assume a set of functions $f(x)$ expressed by

$$\int_a^b |f(x)|^p\,dx < \infty.$$

Then, this set constitutes a specific class of Banach spaces, called $L^p$
spaces, under the $p$-norm:

$$\|f\|_p = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}. \tag{4.12}$$

Completeness is proved in Sect. 4.3.2.

Now we focus on the counterpart, i.e., a normed space endowed with an inner
product, which need not be complete, known as a pre-Hilbert space.







♠ Pre-Hilbert space:

If a normed space is equipped with an inner product (not necessarily
complete), then it is called a pre-Hilbert space.


Finally, we are at a point at which we can appreciate the definition of Hilbert
spaces. They are defined as the intersection between Banach spaces and
pre-Hilbert spaces as stated below.


♠ Hilbert space:

A complete pre-Hilbert space, i.e., a complete normed space endowed
with an inner product, is called a Hilbert space.


Examples The $\ell^p$ spaces and $L^p$ spaces with $p = 2$, known as the $\ell^2$ spaces
and $L^2$ spaces, are Hilbert spaces. The inner product of each space,
respectively, is given by

$$(x, y) = \sum_{i=1}^\infty x_i^* y_i \quad \text{and} \quad (f, g) = \int_a^b f^*(x) g(x)\,dx. \tag{4.13}$$


Remark. Clearly the quantities $(x, x)^{1/2}$ and $(f, f)^{1/2}$, defined through the
inner products (4.13), are special cases of the $p$-norm given by (4.11) and
(4.12), respectively, with $p = 2$. In fact, for the $\ell^2$ and $L^2$ spaces, the inner
products are defined such that

$$(x, x) = \|x\|^2 \quad \text{and} \quad (f, f) = \|f\|^2.$$

However, for $\ell^p$ and $L^p$ spaces with $p \ne 2$, we cannot introduce inner products
as

$$(x, x) = (\|x\|_p)^p \quad \text{and} \quad (f, f) = (\|f\|_p)^p$$

because unless $p = 2$ the $p$-norm violates the parallelogram law. Accordingly,
among the family of $\ell^p$ and $L^p$, only the spaces $\ell^2$ and $L^2$ can be Hilbert
spaces because they have an inner product.
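The claim that the $p$-norm violates the parallelogram law unless $p = 2$ can be verified on two unit vectors. The sketch below evaluates the defect $\|x+y\|_p^2 + \|x-y\|_p^2 - 2(\|x\|_p^2 + \|y\|_p^2)$ for the assumed test pair $x = (1, 0)$, $y = (0, 1)$; the function names are illustrative.

```python
def p_norm(x, p):
    """The p-norm (sum |x_i|^p)^(1/p) of a finite vector, p >= 1."""
    return sum(abs(a) ** p for a in x) ** (1.0 / p)


def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) in the p-norm.

    The defect vanishes for every pair x, y only when p = 2.
    """
    total = [a + b for a, b in zip(x, y)]
    diff = [a - b for a, b in zip(x, y)]
    lhs = p_norm(total, p) ** 2 + p_norm(diff, p) ** 2
    rhs = 2 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2)
    return lhs - rhs


if __name__ == "__main__":
    x, y = (1.0, 0.0), (0.0, 1.0)
    for p in (1, 1.5, 2, 3, 4):
        # Positive for p < 2, zero at p = 2, negative for p > 2 on this pair.
        print(p, parallelogram_defect(x, y, p))
```

For this pair the defect equals $2 \cdot 2^{2/p} - 4$, which changes sign exactly at $p = 2$.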


4.2.5 Basis of a Vector Space: Revisited 

For use in Sect. 4.2.6, we briefly review the definition of a basis in a finite- 
dimensional vector space and related matters. 
♠ Linearly independent vectors:

A finite set of vectors, say, $e_1, e_2, \cdots, e_n$, is linearly independent if
and only if

$$\sum_{i=1}^n c_i e_i = 0 \;\Longrightarrow\; c_i = 0 \text{ for all } i. \tag{4.14}$$

This definition applies to infinite sets of vectors $e_1, e_2, \cdots$ if the vector space
under consideration admits a definition of convergence (see Sect. 4.2.6 for
details).


♠ Basis of a vector space:

A basis of the vector space $V$ is a set of linearly independent vectors
$\{e_i\}$ of $V$ such that every vector $x$ of $V$ can be expressed as

$$x = \sum_{i=1}^n \alpha_i e_i. \tag{4.15}$$

Here, the numbers $\alpha_1, \alpha_2, \cdots, \alpha_n$ are coordinates of the vector $x$ with
respect to the basis, and they are uniquely determined owing to the linear
independence property.

Therefore, every set of n linearly independent vectors is a basis in a finite- 
dimensional vector space spanned by n vectors. The number n is called the di- 
mension of the vector space. Obviously, an infinite-dimensional vector space 
does not admit a finite basis, which is why it is called infinite-dimensional. 

4.2.6 Orthogonal Bases in Hilbert Spaces 

For any vector space (finite- or infinite-dimensional), a set of orthogonal
vectors $\{x_n\}$ is called an orthogonal basis if it is complete. Similarly, a
complete orthogonal set of vectors is called an orthonormal basis if the norm
$\|x_n\| = 1$ for all $n$. It is convenient to use orthonormal bases in studying
Hilbert spaces, since any vector in the space can be decomposed into a linear
combination of the basis vectors. However, when we choose some basis for
an infinite-dimensional space, some care must be taken to examine its
completeness property; i.e., an infinite sum of vectors in a vector space may or
may not converge to an element of the same vector space.

To examine this point, let us consider an infinite set $\{e_i\}$ $(i = 1, 2, \cdots)$ of
orthonormal vectors all belonging to a Hilbert space $V$. We take any vector
$x \in V$ and form the set of vectors

$$x_n = \sum_{i=1}^n c_i e_i \quad (n = 1, 2, \cdots), \tag{4.16}$$

where the complex number $c_i$ is the inner product of $e_i$ and $x$ expressed by

$$c_i = (e_i, x).$$

For the pair of vectors $x$ and $x_n$, the Schwarz inequality (4.2) gives

$$|(x, x_n)|^2 \le \|x\|^2 \|x_n\|^2 = \|x\|^2 \left( \sum_{i=1}^n |c_i|^2 \right). \tag{4.17}$$

On the other hand, taking the inner product of (4.16) with $x$ yields

$$(x, x_n) = \sum_{i=1}^n c_i (x, e_i) = \sum_{i=1}^n |c_i|^2. \tag{4.18}$$

From (4.17) and (4.18), we have

$$\sum_{i=1}^n |c_i|^2 \le \|x\|^2.$$

This conclusion is true for arbitrarily large $n$ and can be stated as shown
below.


4 Bessel inequality: 

Let {e,} (i = 1,2, ■■■) be an infinite set of orthonormal vectors in a 
Hilbert space V. Then for any x GV with Cj = (e.;, x), we have 

OO 

J2 i Ci i 2 - ini 2 ’ 

i—l 

which is known as the Bessel inequality. 
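The content of the Bessel inequality is already visible in a finite-dimensional
toy model. The sketch below is an illustration under assumptions of our own
choosing: the space is $\mathbb{C}^3$, the orthonormal set consists of only two
vectors (so it is not complete), and the inner product is taken to be
conjugate-linear in its first argument, in line with $c_i = (e_i, x)$.

```python
import math

def inner(u, v):
    """Inner product (u, v), conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

# Two orthonormal vectors in C^3: an orthonormal set that is not complete.
s = 1 / math.sqrt(2)
e = [(s, s * 1j, 0), (s, -s * 1j, 0)]
x = (1 + 2j, 3, -1j)

c = [inner(ei, x) for ei in e]                 # coefficients c_i = (e_i, x)
bessel_sum = sum(abs(ci) ** 2 for ci in c)     # sum of |c_i|^2
norm_sq = inner(x, x).real                     # ||x||^2
```

Here `bessel_sum` is strictly smaller than `norm_sq`; the deficit is the squared
modulus of the third component of `x`, which the two chosen vectors cannot
capture.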

The Bessel inequality shows that the limiting vector

$$\lim_{n \to \infty} \sum_{i=1}^{n} c_i e_i = \sum_{i=1}^{\infty} c_i e_i \tag{4.19}$$

has a finite norm, which means that the vector (4.19) is convergent. However,
we still do not know whether it converges to $x$. To make such a statement, the
set $\{e_i\}$ should be equipped with the completeness property defined below.

Complete orthonormal vectors:

An infinite set of orthonormal vectors $\{e_i\}$ in a Hilbert space $V$ is called
complete if the only vector in $V$ that is orthogonal to all the $e_i$ is the zero
vector.

The following is an immediate consequence of the above statement.

♠ Parseval identity:

Let $\{e_i\}$ be an infinite set of orthonormal vectors in a Hilbert space $V$.
Then for any $x \in V$,

$$\{e_i\} \text{ is complete} \iff \|x\|^2 = \sum_{i=1}^{\infty} |c_i|^2 \ \text{ with } c_i = (e_i, x). \tag{4.20}$$


Proof Suppose that the set $\{e_i\}$ is complete and consider the vector defined
by

$$y = x - \sum_{i=1}^{\infty} c_i e_i,$$

where $x \in V$ and $c_i = (e_i, x)$. It follows that for any $e_j$,

$$(e_j, y) = (e_j, x) - \sum_{i=1}^{\infty} c_i (e_j, e_i) = c_j - \sum_{i=1}^{\infty} c_i \delta_{ji} = 0. \tag{4.21}$$

In view of the definition of the completeness of $\{e_i\}$, (4.21) means that $y$ is
the zero vector. Hence, we have

$$x = \sum_{i=1}^{\infty} c_i e_i,$$

which implies

$$\|x\|^2 = \sum_{i=1}^{\infty} |c_i|^2.$$

We now consider the converse. Suppose $x$ to be orthogonal to all the $e_i$,
which means

$$(e_i, x) = c_i = 0 \ \text{ for all } i. \tag{4.22}$$

It follows from (4.20) and (4.22) that $\|x\|^2 = 0$, which in turn gives $x = 0$,
because only the zero vector has a zero norm. This completes the
proof. ♣
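The two halves of the proof can be traced numerically in a finite-dimensional
stand-in for $V$. The sketch below assumes $V = \mathbb{C}^3$ with a complete
orthonormal set of three vectors (the particular vectors are an arbitrary
choice); it forms $y = x - \sum_i c_i e_i$ and compares $\sum_i |c_i|^2$ with
$\|x\|^2$.

```python
import math

def inner(u, v):
    """Inner product (u, v), conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)
basis = [(s, s * 1j, 0), (s, -s * 1j, 0), (0, 0, 1)]   # complete in C^3
x = (1 + 2j, 3, -1j)

c = [inner(e, x) for e in basis]                       # c_i = (e_i, x)
parseval_sum = sum(abs(ci) ** 2 for ci in c)
norm_sq = inner(x, x).real

# y = x - sum_i c_i e_i, computed component by component.
y = [xj - sum(ci * e[j] for ci, e in zip(c, basis)) for j, xj in enumerate(x)]
residual = max(abs(yj) for yj in y)
```

With the complete set, `residual` vanishes up to rounding and `parseval_sum`
agrees with `norm_sq`; removing any one of the three vectors would leave only
the Bessel inequality.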


We close this section by providing precise terminology for the basis of a Hilbert
space.

♠ Basis of a Hilbert space:

A complete orthonormal set $\{e_i\}$ ($i = 1, 2, \cdots$) in a Hilbert space $V$ is
called a basis of $V$.


Remark.

1. The concept of completeness of an orthonormal set of vectors is distinct
from the concept of completeness of the Hilbert space, but they are
mutually related.

2. In order to define generalized Fourier coefficients $c_i = (e_i, x)$ for
$x \in V$ (see Sect. 4.3.4), it suffices for the set $\{e_i\}$ to be only orthonormal,
not necessarily complete.


4.3 Hilbert Spaces of $\ell^2$ and $L^2$
4.3.1 Completeness of the $\ell^2$ Spaces

In this subsection, we examine the completeness property of the space $\ell^2$ on
the field $F$ (here $F = \mathbb{R}$ or $\mathbb{C}$). As already noted, the completeness of a given
vector space $V$ is characterized by the fact that every Cauchy sequence $(x_n)$
involved in the space converges to an element $x \in V$ such that
$\lim_{n \to \infty} \|x - x_n\| = 0$. Hence, to prove the completeness of the $\ell^2$ space, we
show in turn that (1) every Cauchy sequence $(x_n)$ in the $\ell^2$ space converges to
a limit $x$, and (2) the limit $x$ belongs to $\ell^2$.

We consider Statement (1). Assume a set of infinite-dimensional vectors

$$x^{(n)} = \left( x_1^{(n)}, x_2^{(n)}, \cdots \right),$$

wherein $x_i^{(n)} \in F$, and let the sequence of vectors $\{x^{(1)}, x^{(2)}, \cdots\}$ be a Cauchy
sequence in the sense of the norm

$$\|x\| = \left( \sum_{i=1}^{\infty} |x_i|^2 \right)^{1/2} < \infty.$$

Then, for any $\varepsilon > 0$, there exists an integer $N$ such that

$$m, n > N \;\Longrightarrow\; \left\| x^{(m)} - x^{(n)} \right\| = \left( \sum_{i=1}^{\infty} \left| x_i^{(m)} - x_i^{(n)} \right|^2 \right)^{1/2} < \varepsilon. \tag{4.23}$$

This implies that

$$\left| x_i^{(m)} - x_i^{(n)} \right| < \varepsilon \tag{4.24}$$

for every $i$ and every $m, n > N$. Furthermore, since (4.23) is true in the limit
$m \to \infty$, we find

$$\left\| x - x^{(n)} \right\| < \varepsilon \tag{4.25}$$

for arbitrary $n > N$. The inequalities (4.24) and (4.25) mean that $x^{(n)}$
converges to the limiting vector expressed by $x \equiv (x_1, x_2, \cdots)$, in which the
component $x_i \in F$ is defined by

$$x_i = \lim_{n \to \infty} x_i^{(n)}. \tag{4.26}$$

(That the limit (4.26) belongs to $F$ is guaranteed by the completeness of $F$.)

The remaining task is to show that the limiting vector $x$ belongs to the
original space $\ell^2$. By the triangle inequality, we have

$$\|x\| = \left\| x - x^{(n)} + x^{(n)} \right\| \le \left\| x - x^{(n)} \right\| + \left\| x^{(n)} \right\|.$$

Hence, for every $n > N$ and for every $\varepsilon > 0$, we obtain

$$\|x\| < \varepsilon + \left\| x^{(n)} \right\|.$$

As the Cauchy sequence $(x^{(1)}, x^{(2)}, \cdots)$ is bounded, $\|x\|$ cannot be greater
than

$$\varepsilon + \limsup_{n \to \infty} \left\| x^{(n)} \right\|$$

and is therefore finite. This implies that the limit vector $x$ belongs to $\ell^2(F)$.
Consequently, we have proven that the space $\ell^2(F)$ is complete.
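A concrete Cauchy sequence in $\ell^2$ may help to fix ideas. The sketch below
assumes the truncated vectors $x^{(n)} = (1, 1/2, \cdots, 1/n, 0, 0, \cdots)$, an
example chosen here for illustration. Their mutual distances satisfy
$\|x^{(m)} - x^{(n)}\|^2 = \sum_{k=n+1}^{m} 1/k^2 < 1/n$ for $m > n$, and the
limit vector $(1, 1/2, 1/3, \cdots)$ has the finite norm $\pi/\sqrt{6}$, so it
belongs to $\ell^2$.

```python
import math

def x_n(n):
    """Nonzero components of x^(n) = (1, 1/2, ..., 1/n, 0, 0, ...)."""
    return [1.0 / k for k in range(1, n + 1)]

def l2_distance(u, v):
    """l^2 distance between two finitely supported sequences."""
    length = max(len(u), len(v))
    u = u + [0.0] * (length - len(u))
    v = v + [0.0] * (length - len(v))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def l2_norm(u):
    return math.sqrt(sum(a * a for a in u))

d_small = l2_distance(x_n(10), x_n(1000))        # ||x^(1000) - x^(10)||
d_large = l2_distance(x_n(1000), x_n(100000))    # ||x^(100000) - x^(1000)||
limit_norm = math.pi / math.sqrt(6)              # since sum 1/k^2 = pi^2 / 6
```

The distances shrink as the indices grow, as (4.23) requires, and
`l2_norm(x_n(n))` approaches `limit_norm` from below.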


Remark. Among the various kinds of Hilbert spaces, the space $\ell^2$ has a
significant importance in mathematical physics, mainly because it provides the
groundwork for the theory of quantum mechanics. In fact, any element $x$ of
the space $\ell^2$ satisfying the normalization condition $\|x\| = \sum_{i=1}^{\infty} |x_i|^2 = 1$ works
as a possible state vector of quantum systems. In the Heisenberg formulation
of quantum mechanics, the infinite-dimensional matrices corresponding
to physical observables act on these state vectors.


4.3.2 Completeness of the $L^2$ Spaces

We next consider another important class of Hilbert spaces, called $L^2$ spaces,
which are spanned by square-integrable functions $\{f_n(x)\}$ on a closed interval,
say $[a, b]$. To prove the completeness of the $L^2$ space, we show that every
Cauchy sequence $\{f_n\}$ in the $L^2$ space converges to a limit function $f(x)$, and
then verify that $f$ belongs to $L^2$.

Let $\{f_1(x), f_2(x), \cdots\}$ be a Cauchy sequence in $L^2$. Then for any small
$\varepsilon > 0$, we can find an integer $N$ such that

$$m, n > N \;\Longrightarrow\; \|f_n - f_m\|^2 = \int_a^b |f_n(x) - f_m(x)|^2 \, dx < \varepsilon.$$

Then, it is always possible to find an integer $n_1$ such that

$$n > n_1 \;\Longrightarrow\; \|f_{n_1}(x) - f_n(x)\| < \frac{1}{2}.$$

By mathematical induction, after finding $n_{k-1} > n_{k-2}$, we find $n_k > n_{k-1}$
such that

$$n > n_k \;\Longrightarrow\; \|f_n(x) - f_{n_k}(x)\| < \left( \frac{1}{2} \right)^k.$$

In this way, we obtain a sequence $(f_{n_k})$ that is a subsequence of $(f_n)$ such that

$$\|f_{n_{k+1}}(x) - f_{n_k}(x)\| < \left( \frac{1}{2} \right)^k \ \text{ for } k = 1, 2, \cdots,$$

so that

$$\|f_{n_1}\| + \sum_{k=1}^{\infty} \|f_{n_{k+1}} - f_{n_k}\| < \|f_{n_1}\| + \sum_{k=1}^{\infty} \left( \frac{1}{2} \right)^k = \|f_{n_1}\| + 1 \equiv A,$$

where $A$ is a finite constant. Let

$$g_k = |f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots + |f_{n_{k+1}} - f_{n_k}| \quad (k = 1, 2, \cdots).$$

Then, by the Minkowski inequality, we have

$$\int_a^b [g_k(x)]^2 \, dx = \int_a^b \left[ |f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots + |f_{n_{k+1}} - f_{n_k}| \right]^2 dx
\le \left( \|f_{n_1}\| + \sum_{i=1}^{k} \|f_{n_{i+1}} - f_{n_i}\| \right)^2 \le A^2 < \infty. \tag{4.27}$$

Let $g(x) = \lim_{k \to \infty} g_k(x)$. Then $[g(x)]^2 = \lim_{k \to \infty} [g_k(x)]^2$, and

$$\int_a^b [g(x)]^2 \, dx = \int_a^b \lim_{k \to \infty} [g_k(x)]^2 \, dx = \lim_{k \to \infty} \int_a^b [g_k(x)]^2 \, dx. \tag{4.28}$$

[See the remark below for the interchangeability of the limit and integral signs
in (4.28).] It follows from (4.27) and (4.28) that

$$\int_a^b [g(x)]^2 \, dx < \infty,$$

or equivalently,

$$\int_a^b \left[ |f_{n_1}| + \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}| \right]^2 dx < \infty.$$

This implies that the infinite sum

$$f_{n_1} + \sum_{k=1}^{\infty} \left( f_{n_{k+1}} - f_{n_k} \right) \tag{4.29}$$

converges to a function, denoted by $f \in L^2$, in the sense of the norm in $L^2$.

We next show that the limit function $f(x)$ expressed by (4.29) is an element
of $L^2$ such that

$$\|f_n(x) - f(x)\| \to 0 \quad (n \to \infty). \tag{4.30}$$

We first note that

$$f(x) - f_{n_j}(x) = \sum_{k=j}^{\infty} \left[ f_{n_{k+1}}(x) - f_{n_k}(x) \right].$$

It follows that

$$\|f - f_{n_j}\| \le \sum_{k=j}^{\infty} \|f_{n_{k+1}} - f_{n_k}\| < \sum_{k=j}^{\infty} \left( \frac{1}{2} \right)^k = \frac{1}{2^{j-1}},$$

so we have

$$\lim_{j \to \infty} \|f - f_{n_j}\| = 0.$$

Observe that

$$\|f_n - f\| \le \|f_n - f_{n_k}\| + \|f_{n_k} - f\|,$$

where $\|f_n - f_{n_k}\| \to 0$ as $n \to \infty$ and $k \to \infty$; thus

$$\lim_{n \to \infty} \|f_n - f\| = 0,$$

which shows that the Cauchy sequence $(f_n)$ converges to $f \in L^2$.


Remark. The interchangeability of limit and integral signs in (4.28) is justified
by the following three facts:

(i) The sequence $([g_k(x)]^2)$ is a sequence of square-integrable functions in
$[a, b]$,

(ii) $[g_k(x)]^2 \ge 0$ for all $k$, and

(iii) The integral $\int_a^b [g_k]^2 \, dx$ for each $k$ has a common bound $A^2$ as shown in
(4.27). The proof of this point is based on the theory of the Lebesgue
integral, which we discuss in Chap. 6.

4.3.3 Mean Convergence


Before proceeding, some comments are in order on a new class of convergence
that is relevant to the argument on the completeness of the $L^2$ space. Observe
that the expression (4.30) can be rephrased as follows: For any small
$\varepsilon > 0$, it is possible to find $N$ such that

$$n > N \;\Longrightarrow\; \|f(x) - f_n(x)\| < \varepsilon. \tag{4.31}$$

Hence, we can say that the infinite sequence $(f_n)$ converges to $f(x)$ in the norm
of the $L^2$ space. Convergence of the type (4.31) is called convergence
in the mean or mean convergence, which is inherently different from
uniform convergence and pointwise convergence. The point is
that in mean convergence, the quantitative deviation between $f_n(x)$ and
$f(x)$ is measured not by the difference $f(x) - f_n(x)$, but by the norm in the
$L^2$ space based on the integration procedure:

$$\|f(x) - f_n(x)\| = \left[ \int_a^b |f(x) - f_n(x)|^2 \, dx \right]^{1/2}.$$

Hence, when $f_n(x)$ converges in the mean to $f(x)$ on the interval $[a, b]$,
there may exist a finite number of isolated points at which $f_n(x)$ does not
converge to $f(x)$. Obviously, this situation is not allowed in cases of uniform
or pointwise convergence.
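A standard example, chosen here for illustration and not taken from the text,
is $f_n(x) = x^n$ on $[0, 1]$. Its $L^2$ norm is $1/\sqrt{2n+1}$, so $f_n$
converges in the mean to the zero function, although $f_n(1) = 1$ for every $n$.
The sketch below estimates the norm with a simple midpoint rule; the helper
names and the number of sample points are assumptions of the sketch.

```python
import math

def l2_norm_on_unit_interval(f, samples=20000):
    """Approximate ( integral_0^1 |f(x)|^2 dx )^(1/2) with the midpoint rule."""
    h = 1.0 / samples
    total = sum(abs(f((j + 0.5) * h)) ** 2 for j in range(samples))
    return math.sqrt(total * h)

def f_n(n):
    """The function x -> x^n."""
    return lambda x: x ** n

orders = (1, 10, 100)
norms = [l2_norm_on_unit_interval(f_n(n)) for n in orders]   # ||f_n - 0||
exact = [1 / math.sqrt(2 * n + 1) for n in orders]           # closed form
value_at_one = [f_n(n)(1.0) for n in orders]                 # stays equal to 1
```

The norms decrease toward zero while the value at the isolated point $x = 1$
never approaches the value of the mean limit there.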


4.3.4 Generalized Fourier Coefficients 

Having clarified the completeness property of the two specific Hilbert spaces,
$\ell^2$ and $L^2$, we introduce two important concepts: generalized Fourier
coefficients and generalized Fourier series. We shall see that they play a
crucial role in revealing the close relationship between the two distinct Hilbert
spaces $\ell^2$ and $L^2$.

Generalized Fourier coefficients:

Suppose that a set of square-integrable functions $\{\phi_i\}$ is orthonormal
(not necessarily complete) in the norm of the $L^2$ space. Then, the numbers

$$c_k = (f, \phi_k) \tag{4.32}$$

are called the Fourier coefficients of the function $f \in L^2$ relative to the
orthonormal set $\{\phi_i\}$, and the series

$$\sum_{k=1}^{\infty} c_k \phi_k \tag{4.33}$$

is called the Fourier series of $f$ with respect to the set $\{\phi_i\}$.

Remark.

1. In general, the Fourier series shown in (4.33) may or may not be
convergent; its convergence property is determined by the features of the
function $f$ and the associated orthonormal set of functions $\{\phi_k\}$.

2. Some readers may be familiar with the Fourier series associated with
trigonometric functions or imaginary exponentials. Notably, however, the
concepts of Fourier series and Fourier coefficients introduced above are
more general than those associated with trigonometric series.

The importance of the Fourier coefficients (4.32) becomes apparent when we
see that they constitute an element of the $\ell^2$ space. In fact, since $c_k$ is the
inner product of $f$ and $\phi_k$, it yields the Bessel inequality in terms of $c_k$ and $f$:

$$\sum_{k=1}^{\infty} |c_k|^2 \le \|f\|^2. \tag{4.34}$$

From the hypothesis of $f \in L^2$, the norm $\|f\|$ remains finite. Hence, the
inequality (4.34) ensures the convergence of the infinite series $\sum_{k=1}^{\infty} |c_k|^2$,
which consists of the Fourier coefficients defined by (4.32). This convergence
means that the sequence of Fourier coefficients $\{c_k\}$ is an element of the space
$\ell^2$, whichever orthonormal set of functions $\phi_k(x)$ we choose. In this context,
the two elements $f \in L^2$ and $c = (c_1, c_2, \cdots) \in \ell^2$ are connected via the
Fourier coefficient (4.32).
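As a worked illustration under assumptions of our own, take $f(x) = x$ on
$[-\pi, \pi]$ and the orthonormal functions $\phi_k(x) = \sin(kx)/\sqrt{\pi}$
($k = 1, 2, \cdots$). The coefficients (4.32) are then
$c_k = 2\sqrt{\pi}\,(-1)^{k+1}/k$ and $\|f\|^2 = 2\pi^3/3$. The sketch computes
them with a composite Simpson rule and checks the inequality (4.34) for a
partial sum; the integrator and its sample count are choices of the sketch.

```python
import math

def integrate(g, a, b, samples=4000):
    """Composite Simpson rule on [a, b]; samples must be even."""
    h = (b - a) / samples
    total = g(a) + g(b)
    for j in range(1, samples):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def phi(k):
    """Orthonormal function phi_k(x) = sin(kx) / sqrt(pi) on [-pi, pi]."""
    return lambda x: math.sin(k * x) / math.sqrt(math.pi)

f = lambda x: x
a, b = -math.pi, math.pi

# Fourier coefficients c_k = (f, phi_k) for k = 1, ..., 20 (real-valued case).
c = [integrate(lambda x, k=k: f(x) * phi(k)(x), a, b) for k in range(1, 21)]
norm_sq = integrate(lambda x: f(x) ** 2, a, b)     # ||f||^2
partial = sum(ck ** 2 for ck in c)                 # partial sum of |c_k|^2
```

The partial sum stays below `norm_sq` for every cutoff, and the sequence
`c` decays like $1/k$, which is square-summable, so it defines an element of
$\ell^2$.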

4.3.5 Riesz-Fischer Theorem

Recall that every set of Fourier coefficients satisfies the Bessel inequality (4.34).
Hence, in order for a given set of complex numbers $(c_k)$ to constitute the
Fourier coefficients of a function $f \in L^2$, it is necessary that the series

$$\sum_{k=1}^{\infty} |c_k|^2$$

converge. As a matter of fact, this condition is not only necessary, but also
sufficient as stated in the theorem below.

♠ Riesz-Fischer theorem:

Given any set of complex numbers $(c_k)$ such that

$$\sum_{k=1}^{\infty} |c_k|^2 < \infty, \tag{4.35}$$

there exists a function $f \in L^2$ such that

$$c_k = (f, \phi_k) \ \text{ and } \ \sum_{k=1}^{\infty} |c_k|^2 = \|f\|^2, \tag{4.36}$$

where $\{\phi_i\}$ is a complete orthonormal set.
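The mechanism behind the theorem can be observed numerically. The sketch below
assumes the square-summable sequence $c_k = 1/k$ and the orthonormal functions
$\phi_k(x) = \sin(kx)/\sqrt{\pi}$ on $[-\pi, \pi]$ (both are illustrative
choices), builds the partial sums $f_n = \sum_{k \le n} c_k \phi_k$, and compares
$\|f_{n+p} - f_n\|^2$ with the tail $\sum_{k=n+1}^{n+p} |c_k|^2$ of the series.

```python
import math

def integrate(g, a, b, samples=4000):
    """Composite Simpson rule on [a, b]; samples must be even."""
    h = (b - a) / samples
    total = g(a) + g(b)
    for j in range(1, samples):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def phi(k, x):
    """Orthonormal function sin(kx) / sqrt(pi) on [-pi, pi]."""
    return math.sin(k * x) / math.sqrt(math.pi)

def coeff(k):
    """Assumed coefficients c_k = 1/k; the sum of 1/k^2 is finite."""
    return 1.0 / k

def dist_sq(n, p):
    """|| f_{n+p} - f_n ||^2 obtained by numerical integration."""
    diff = lambda x: sum(coeff(k) * phi(k, x) for k in range(n + 1, n + p + 1))
    return integrate(lambda x: diff(x) ** 2, -math.pi, math.pi)

def tail(n, p):
    """sum_{k=n+1}^{n+p} |c_k|^2."""
    return sum(coeff(k) ** 2 for k in range(n + 1, n + p + 1))

numeric = dist_sq(5, 10)
expected = tail(5, 10)
```

Because the tails `tail(n, p)` become small for large `n` whatever the value of
`p`, the partial sums form a Cauchy sequence in the $L^2$ norm.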


Proof Set linear combinations of $\phi_k(x)$ as

$$f_n(x) = \sum_{k=1}^{n} c_k \phi_k(x), \tag{4.37}$$

where the $c_k$ are arbitrary complex numbers satisfying condition (4.35). Then,
for a given integer $p \ge 1$, we obtain

$$\|f_{n+p} - f_n\|^2 = \|c_{n+1}\phi_{n+1} + \cdots + c_{n+p}\phi_{n+p}\|^2 = \sum_{k=n+1}^{n+p} |c_k|^2. \tag{4.38}$$

Let $p = 1$ and $n \to \infty$. Then, from condition (4.35), we have

$$\|f_{n+1} - f_n\|^2 = |c_{n+1}|^2 \to 0 \quad (n \to \infty).$$

More generally, the right-hand side of (4.38) is part of the tail of the convergent
series (4.35), so it tends to zero as $n \to \infty$ for every $p$; hence $\{f_n\}$ is a
Cauchy sequence in $L^2$. Since $L^2$ is complete, this tells us that the infinite
sequence $\{f_n\}$ defined by (4.37) associated with a given set of complex
numbers $\{c_i\}$ always converges in the mean to a function $f \in L^2$.

Our remaining task is to show that this limit function $f(x)$ satisfies
condition (4.36), so we consider the inner product

$$(f, \phi_i) = (f_n, \phi_i) + (f - f_n, \phi_i), \tag{4.39}$$

where we assume $n > i$. It follows from (4.37) that the first term on the
right-hand side is equal to $c_i$. The second term vanishes as $n \to \infty$, since

$$|(f - f_n, \phi_i)| \le \|f - f_n\| \cdot \|\phi_i\| \to 0 \quad (n \to \infty),$$

where we used the mean convergence of $\{f_n\}$ to $f$. In addition, the left-hand
side of (4.39) is independent of $n$. Hence, taking the limit $n \to \infty$ on both
sides of (4.39), we obtain

$$(f, \phi_i) = c_i, \tag{4.40}$$

which means that $c_i$ is the Fourier coefficient of $f$ relative to $\phi_i$. From our
assumption, the set $\{\phi_i\}$ is complete and orthonormal. Hence, the Fourier
coefficients (4.40) satisfy the Parseval identity:

$$\sum_{k=1}^{\infty} |c_k|^2 = \|f\|^2. \tag{4.41}$$

The results (4.40) and (4.41) are identical to condition (4.36), thus proving
the theorem. ♣

4.3.6 Isomorphism between $\ell^2$ and $L^2$


The Riesz-Fischer theorem leads immediately to the isomorphism between
the Hilbert spaces $L^2$ and $\ell^2$. An isomorphism is a one-to-one correspondence
that preserves the entire algebraic structure. For instance, two vector spaces
$U$ and $V$ (over the same number field) are isomorphic if there exists a
one-to-one correspondence between the vectors $x_i$ in $U$ and $y_i$ in $V$, say $y_i = f(x_i)$,
such that

$$f(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 f(x_1) + \alpha_2 f(x_2).$$

The isomorphism between $L^2$ and $\ell^2$ is closely related to the theory
of quantum mechanics, which originally consisted of two distinct theories:
Heisenberg's matrix mechanics, based on infinite-dimensional vectors, and
Schrödinger's wave mechanics, based on square-integrable functions. From the
mathematical point of view, the difference between the two theories reduces
to the fact that the former uses the space $\ell^2$, whereas the latter uses the space
$L^2$. Hence, the isomorphism between the two spaces verifies the equivalence
of the two theories describing the nature of quantum mechanics.

Let us prove the above point. Choose an arbitrary complete orthonormal
set $\{\phi_n\}$ in $L^2$ and assign to each function $f \in L^2$ the sequence
$(c_1, c_2, \cdots, c_n, \cdots)$ of its Fourier coefficients with respect to this set. Since

$$\sum_{k=1}^{\infty} |c_k|^2 = \|f\|^2 < \infty,$$

the sequence $(c_1, c_2, \cdots, c_n, \cdots)$ is an element of $\ell^2$. Conversely, in view of
the Riesz-Fischer theorem, for every element $(c_1, c_2, \cdots, c_n, \cdots)$ of $\ell^2$ there
is a function $f(x) \in L^2$ whose Fourier coefficients are $c_1, c_2, \cdots, c_n, \cdots$. This
correspondence between the elements of $L^2$ and $\ell^2$ is one-to-one. Furthermore,
if

$$f(x) \longleftrightarrow (c_1, c_2, \cdots, c_n, \cdots)$$

and

$$g(x) \longleftrightarrow (d_1, d_2, \cdots, d_n, \cdots),$$

then

$$f(x) + g(x) \longleftrightarrow (c_1 + d_1, \cdots, c_n + d_n, \cdots)$$

and

$$k f(x) \longleftrightarrow (k c_1, k c_2, \cdots, k c_n, \cdots),$$

which readily follows from the definition of Fourier coefficients (the reader
should prove it). That is, addition and multiplication by scalars are preserved
by the correspondence. Furthermore, in view of Parseval's identity, it follows
that

$$(f, g) = \sum_{i=1}^{\infty} c_i^* d_i. \tag{4.42}$$

All of these facts ensure the isomorphism between the spaces $L^2$ and $\ell^2$,
i.e., the one-to-one correspondence between the elements of $L^2$ and $\ell^2$ that
preserves the algebraic structures of the space. In this context, we may say
that every element $\{c_i\}$ in an $\ell^2$ space serves as a coordinate system of the $L^2$
space, and vice versa.


Exercises

1. Prove the inequality $\sum_{k=1}^{\infty} |c_k|^2 \le \|f\|^2$ given in (4.34).

Solution: Suppose a partial sum $S_n(x) = \sum_{k=1}^{n} \alpha_k \phi_k(x)$, where
$\alpha_k$ is a certain number (real or complex). Since the set $\{\phi_i\}$ is
orthonormal,

$$\left( f - \sum_{i=1}^{n} \alpha_i \phi_i(x), \; f - \sum_{k=1}^{n} \alpha_k \phi_k(x) \right)
= \|f\|^2 - \sum_{k=1}^{n} |c_k|^2 + \sum_{k=1}^{n} |\alpha_k - c_k|^2. \tag{4.43}$$

The minimum of (4.43) is assumed if $\alpha_k = c_k$. In that case, the
equation (4.43) reads

$$\left\| f(x) - \sum_{k=1}^{n} c_k \phi_k(x) \right\|^2 = \|f\|^2 - \sum_{k=1}^{n} |c_k|^2,$$

which implies $\sum_{k=1}^{n} |c_k|^2 \le \|f\|^2$, because the left-hand side is
nonnegative. Since the right-hand side of this inequality is
independent of $n$, the value of $n$ can be taken arbitrarily large.
Hence, by taking the limit $n \to \infty$, we attain the desired result:
$\sum_{k=1}^{\infty} |c_k|^2 \le \|f\|^2$. ♣

2. Verify the equation $(f, g) = \sum_{i=1}^{\infty} c_i^* d_i$ given in (4.42).

Solution: This equality is verified because of the relations
$(f, f) = \sum_{i=1}^{\infty} |c_i|^2$ and $(g, g) = \sum_{i=1}^{\infty} |d_i|^2$, and their
consequences:

$$(f + g, f + g) = (f, f) + 2(f, g) + (g, g) = \sum_{i=1}^{\infty} |c_i + d_i|^2
= \sum_{i=1}^{\infty} |c_i|^2 + 2 \sum_{i=1}^{\infty} c_i^* d_i + \sum_{i=1}^{\infty} |d_i|^2. \ ♣$$

Orthonormal Polynomials


Abstract The theory of Hilbert spaces we dealt with in Chap. 4 can be used to 
construct a number of polynomial functions that are orthonormal and complete in 
the sense of the $L^p$ space. In this chapter we present three important approaches
for the construction of orthonormal polynomials, based, respectively, on the Weier- 
strass theorem (Sect. 5.1.1), the Rodrigues formula (Sect. 5.2.1), and generating 
functions (Sect. 5.2.7). We shall find that various orthonormal polynomials relevant 
to mathematical physics can be effectively classified by adopting these methods. 


5.1 Polynomial Approximations 

5.1.1 Weierstrass Theorem 

There are a number of special polynomials that play a significant role in
various aspects of mathematical physics: Legendre, Laguerre, Hermite, and
Chebyshev polynomials are well known. For instance, Legendre and Laguerre
polynomial expansions are often used to solve second-order differential
equations having spherical symmetry. The point is that many of these special
polynomials form a complete orthonormal set of polynomials; the origin
of their orthonormality and completeness can be accounted for in terms of
the theory of the Hilbert space $L^2$. Owing to completeness, these special
polynomials enable us to produce polynomial approximations of fairly arbitrary
functions with desired accuracy, which serves as a useful device in
manipulating square-integrable functions.

The validity of polynomial approximations is based on the famous
Weierstrass approximation theorem, which states that from the set of
powers of a real variable $x$ one can construct a sequence of polynomials that
converges uniformly to any continuous function within a finite interval $[a, b]$.
From this result, we shall see that it is possible to find various kinds of
complete orthonormal sets of polynomials on any interval $[a, b]$.

In what follows, for simplicity we focus on polynomial approximations
only of real-valued functions of a real variable. In the case of a complex-valued
function, the separate validity of the theorem for each of its real and imaginary
parts ensures the validity of the theorem.

♠ Weierstrass approximation theorem:

If a function $f(x)$ is continuous on the closed interval $[a, b]$, there exists
a sequence of polynomials

$$G_n(x) = \sum_{k=0}^{n} c_k x^k \tag{5.1}$$

that converges uniformly to $f(x)$ on $[a, b]$.
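One classical way to exhibit such a sequence explicitly is through Bernstein
polynomials, $B_n f(x) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1-x)^{n-k}$ on
$[0, 1]$. The sketch below uses this construction purely as an illustration; it
is not assumed to be the route followed in the proof referred to in the text.
The test function $f(x) = |x - 1/2|$, which is continuous but not
differentiable at $x = 1/2$, and the helper names are choices of the sketch.

```python
import math

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f on [0, 1] as a callable."""
    values = [f(k / n) for k in range(n + 1)]
    def B(x):
        return sum(values[k] * math.comb(n, k) * x ** k * (1 - x) ** (n - k)
                   for k in range(n + 1))
    return B

def max_error(f, g, samples=400):
    """Largest deviation |f - g| on a uniform grid of [0, 1]."""
    return max(abs(f(j / samples) - g(j / samples)) for j in range(samples + 1))

f = lambda x: abs(x - 0.5)
degrees = (4, 16, 64, 256)
errors = [max_error(f, bernstein(f, n)) for n in degrees]
```

The maximum error over the interval decreases as the degree grows, which is
what uniform convergence means, although the decrease is slow for this
non-smooth function.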


The proof is given in Appendix C. Several remarks on this theorem
are given below.

• In the polynomial approximation based on (5.1), the values of the coefficients
$c_m^{(n)}$ depend on $n$ for fixed $m$. Thus, in order to improve the accuracy of
the approximation by going to polynomials of higher degree, the earlier
coefficients must change. For instance, when the approximating polynomial
(5.1) is replaced by

$$G_{n+1}(x) = \sum_{k=0}^{n+1} d_k x^k,$$

we have in general

$$c_k \ne d_k \ \text{ for all } k \, (\le n).$$

This situation is in contrast to the case of our familiar Taylor series
expansions, in which the earlier coefficients remain unchanged.

• The Weierstrass theorem requires only that the functions to be
approximated be continuous. This condition is much weaker than that of
Taylor's theorem for expansion in power series, in which the derivatives of
all orders must exist (i.e., the function must be analytic; see Sect. 7.1.2 for
the definition of analytic functions). Furthermore, the former theorem can
apply to polynomial approximations outside the radius of convergence (see
Sect. 7.4.1) of a Taylor series.

• The Weierstrass theorem may be extended to functions of more than one
variable. By a straightforward generalization of the proof (see Appendix
C), it can be shown that if a function $f(x_1, x_2, \cdots, x_m)$ is continuous
in each variable $x_i$ located within $[a_i, b_i]$ ($i = 1, 2, \cdots, m$), it may be
approximated uniformly by the polynomials

$$G_N(x_1, \cdots, x_m) = \sum_{k_1=0}^{N_1} \sum_{k_2=0}^{N_2} \cdots \sum_{k_m=0}^{N_m} c_{k_1 k_2 \cdots k_m} \, x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m}.$$

The special cases of $m = 2$ and $m = 3$ are considered in Sects. 5.1.4 and
5.1.5.

5.1.2 Existence of Complete Orthonormal Sets of Polynomials

It must be emphasized that the Weierstrass theorem does not require the set
of polynomials $\{G_n\}$ to be orthogonal or complete. Nevertheless, the
theorem ensures indirectly the existence of a variety of complete orthonormal
sets of polynomials in terms of the $L^2$ space. The proof of their existence
is based on the Gram-Schmidt orthogonalization method shown
below.

♠ Gram-Schmidt orthogonalization method:

Given any set of linearly independent functions $\{\varphi_i\}$ normalizable on a
closed interval, it is possible to construct an orthonormal set of functions
$\{Q_i\}$ through the recursion formula

$$Q_i(x) = \frac{u_i(x)}{\|u_i(x)\|} \quad (i = 1, 2, \cdots)$$

with the definitions:

$$u_1(x) = \varphi_1(x), \qquad u_{i+1}(x) = \varphi_{i+1}(x) - \sum_{k=1}^{i} \frac{(u_k, \varphi_{i+1})}{(u_k, u_k)} \, u_k(x).$$

Here, $(u_k, \varphi_{i+1})$ means the inner product in terms of the $L^2$ space. Let us
apply the Gram-Schmidt orthogonalization process to a set of powers $\{x^n\}$
that is linearly independent. We then obtain an orthonormal set $\{Q_i\}$ given by

$$Q_i(x) = \sum_{m=0}^{i} b'^{(i)}_m x^m. \tag{5.2}$$

Owing to the orthogonality of the set $\{Q_i\}$, the original functions $x^m$ are
expressed conversely by linear combinations of $\{Q_i\}$ such as

$$x^m = \sum_{i=0}^{m} b_i^{(m)} Q_i(x). \tag{5.3}$$

Substituting (5.3) into (5.1), we obtain

$$G_n(x) = \sum_{m=0}^{n} a_m^{(n)} \sum_{i=0}^{m} b_i^{(m)} Q_i(x). \tag{5.4}$$

The superscripts $(n)$ and $(m)$ attached to the coefficients $a_m^{(n)}$ and $b_i^{(m)}$,
respectively, remind us that the values of the terms contained in the finite
sequences,

$$\{a_0^{(n)}, a_1^{(n)}, \cdots, a_n^{(n)}\} \ \text{ and } \ \{b_0^{(m)}, b_1^{(m)}, \cdots, b_m^{(m)}\},$$

depend on $n$ or $m$: as $n$ (or $m$) increases, all the earlier terms in the sequence
must be altered.

Now let us show the completeness of the orthonormal set $\{Q_n(x)\}$ given by
(5.2), which was deduced from the orthogonalization process; this is achieved
by proving that Parseval's identity,

$$\sum_{n=0}^{\infty} |(f, Q_n)|^2 = \|f\|^2,$$

holds for any $f \in L^2$, or equivalently, by proving that

$$(f, Q_n) = 0 \ \text{ for all } n \iff \|f\| = 0. \tag{5.5}$$

The sentence "$\|f\| = 0$ implies $(f, Q_n) = 0$ for all $n$" immediately follows
from the Bessel inequality,

$$\sum_{k=0}^{\infty} |(f, Q_k)|^2 \le \|f\|^2.$$

To prove the converse, we note that if $(f, Q_n) = 0$ for all $n$, we have

$$(f, G_n) = 0 \ \text{ for all } n, \tag{5.6}$$

since the $G_n$ are linear combinations of the $Q_n$. In addition, we recall that
the Weierstrass theorem guarantees the uniform convergence of the sequence
$\{G_n\}$ to $f$. Since uniform convergence implies mean convergence, we obtain

$$\|f - G_n\| \to 0. \tag{5.7}$$

From (5.6) and (5.7), it follows that

$$\|f - G_n\|^2 = (f - G_n, f - G_n) = \|f\|^2 + \|G_n\|^2 \to 0,$$

which implies that $\|f\| = 0$ as well as $\|G_n\|^2 \to 0$. (This is because $\|f\|^2$ is
independent of $n$ and $\|G_n\|^2$ is nonnegative for all $n$.) As summarized, we
attain the desired conclusion (5.5), which indicates that the orthonormal set
$\{Q_n\}$ is complete in terms of the $L^2$ space.

The completeness of the set $\{Q_i\}$ means that there exists a set of constants
$\{c_i\}$ such that any function $g \in L^2$ can be approximated in the mean by the
following sequence of partial sums:

$$g_n(x) = \sum_{i=0}^{n} c_i Q_i(x). \tag{5.8}$$

The reader should appreciate a crucial difference between (5.1) and (5.8). In
the latter, the $c_i$ are independent of $n$ in contrast to the case of (5.1). Thus as
we extend the sum to infinity, the approximation improves without changing
the earlier $c_i$. Therefore, we may say that there exists an infinite series

$$\lim_{n \to \infty} g_n(x) = \sum_{i=0}^{\infty} c_i Q_i(x)$$

that converges to $g$ in the mean. The expansion coefficients $c_i = (g, Q_i)$ in the
infinite series are the Fourier coefficients we introduced in Sect. 4.3.4.
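The Gram-Schmidt construction applied to the powers $1, x, x^2, \cdots$ can be
carried out exactly on a computer. The sketch below assumes the interval
$[-1, 1]$ with unit weight, represents a polynomial by its list of coefficients
(index = power of $x$), and uses exact rational arithmetic; the function names
are illustrative. Besides the orthonormal $Q_i$, it also rescales each
polynomial so that its value at $x = 1$ equals 1, a common convention for
comparing with tabulated polynomials.

```python
from fractions import Fraction
import math

def inner(p, q):
    """L^2 inner product on [-1, 1] of real polynomials (coefficient lists)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                 # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def gram_schmidt_monomials(count):
    """Orthogonal, not yet normalized, polynomials built from 1, x, x^2, ..."""
    us = []
    for m in range(count):
        u = [Fraction(0)] * m + [Fraction(1)]    # the monomial x^m
        for prev in us:
            factor = inner(prev, u) / inner(prev, prev)
            u = [c - factor * (prev[k] if k < len(prev) else 0)
                 for k, c in enumerate(u)]
        us.append(u)
    return us

us = gram_schmidt_monomials(4)
# Orthonormal functions Q_i = u_i / ||u_i|| (floating point from here on).
Q = [[float(c) / math.sqrt(float(inner(u, u))) for c in u] for u in us]
# Rescaled versions with value 1 at x = 1; sum(u) is the value u(1).
scaled = [[c / sum(u) for c in u] for u in us]
```

For instance, `scaled[2]` is $\tfrac{1}{2}(3x^2 - 1)$ and `scaled[3]` is
$\tfrac{1}{2}(5x^3 - 3x)$.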

5.1.3 Legendre Polynomials 

The previous discussion revealed that the orthonormal set $\{Q_i\}$ constructed
from the orthogonalization process based on the set of powers $\{x^m\}$ is
complete, so that the linear combination $\sum_{i=0}^{\infty} c_i Q_i$ converges in the mean to
$f \in L^2$. Let us employ this result to find an explicit function form of a
complete orthonormal set of functions $\{P_n\}$ defined on the interval $[-1, 1]$. The
first member of such a complete orthogonal set is $P_0(x) = 1$ (for convenience,
the normalization constant is omitted temporarily). Using the Gram-Schmidt
orthogonalization process, we have

$$P_1(x) = \frac{x - (x, P_0) P_0}{\|x - (x, P_0) P_0\|} = x,$$

$$P_2(x) = \frac{x^2 - (x^2, P_0) P_0 - (x^2, P_1) P_1}{\|x^2 - (x^2, P_0) P_0 - (x^2, P_1) P_1\|} = \frac{1}{2} \left( 3x^2 - 1 \right),$$

where we use the notation

$$(x^m, P_n) = \int_{-1}^{1} x^m P_n(x) \, dx.$$

Here the projections are understood to be taken onto the normalized functions,
and each final expression is written without its normalization constant, with
the overall factor fixed by the convention $P_n(1) = 1$.

Successive procedures give

$$P_3(x) = \frac{1}{2} \left( 5x^3 - 3x \right), \quad P_4(x) = \frac{1}{8} \left( 35x^4 - 30x^2 + 3 \right),$$

$$P_5(x) = \frac{1}{8} \left( 63x^5 - 70x^3 + 15x \right), \ \cdots.$$

Eventually, we obtain the complete orthogonal set of polynomials $\{P_n\}$
known as the Legendre polynomials (they become orthonormal once each $P_n$ is
multiplied by $\sqrt{(2n+1)/2}$). The $x$ dependence of each function
is plotted in Fig. 5.1. Note that $P_n(x)$ has exactly $n$ distinct zeros in the
open interval $(-1, 1)$.

A general formula for $P_n(x)$ is given by


$$P_n(x) = \frac{1}{2^n} \sum_{k=0}^{[n/2]} (-1)^k \frac{(2n - 2k)!}{k! \, (n - k)! \, (n - 2k)!} \, x^{n - 2k}, \tag{5.9}$$

where we used the Gauss notation:

$$\left[ \frac{n}{2} \right] = \begin{cases} \dfrac{n}{2} & \text{if } n \text{ is even}, \\[2mm] \dfrac{n - 1}{2} & \text{if } n \text{ is odd}. \end{cases}$$

Equation (5.9) is rewritten in a simpler form as

$$P_n(x) = \frac{1}{2^n} \sum_{k=0}^{[n/2]} \frac{(-1)^k}{k! \, (n - k)!} \frac{d^n}{dx^n} x^{2n - 2k}
= \frac{1}{2^n n!} \frac{d^n}{dx^n} \sum_{k=0}^{n} \frac{(-1)^k \, n!}{k! \, (n - k)!} \, x^{2n - 2k}
= \frac{1}{2^n n!} \frac{d^n}{dx^n} \left( x^2 - 1 \right)^n. \tag{5.10}$$

Fig. 5.1. Profiles of the first three terms of the Legendre polynomial $P_n(x)$

The last line is known as the Rodrigues formula for Legendre polynomials.
This is a special form of the more general Rodrigues formula that is
applicable to any orthonormal polynomial function. The derivations of (5.9) and
(5.10), as well as that of the general Rodrigues formula, are given in Sects. 5.2.1
and 5.2.2.
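The agreement between the explicit sum (5.9) and the Rodrigues formula (5.10)
can be checked with exact rational arithmetic. In the sketch below a polynomial
is stored as a list of coefficients indexed by the power of $x$; the helper
names are assumptions of the sketch, not notation from the text.

```python
from fractions import Fraction
from math import factorial

def legendre_explicit(n):
    """Coefficients of P_n from the explicit sum (5.9); index = power of x."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        num = (-1) ** k * factorial(2 * n - 2 * k)
        den = 2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def legendre_rodrigues(n):
    """Coefficients of P_n from the Rodrigues formula (5.10)."""
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # (x^2 - 1)
    for _ in range(n):
        p = poly_diff(p)
    return [c / (2 ** n * factorial(n)) for c in p]

p4 = legendre_explicit(4)
```

Both routes return the same coefficient lists; for example `p4` corresponds to
$\tfrac{1}{8}(35x^4 - 30x^2 + 3)$.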

The orthogonality of the Legendre polynomials follows from the Rodrigues
formula (5.10). To see this, we denote $d^n/dx^n$ by $d_n$, and assume that $n \ge m$.
Dropping constant factors, we have

$$\int_{-1}^{1} P_n(x) P_m(x) \, dx = \int_{-1}^{1} \left[ d_n (x^2 - 1)^n \right] \left[ d_m (x^2 - 1)^m \right] dx$$

$$= \Big[ \left[ d_{n-1} (x^2 - 1)^n \right] \left[ d_m (x^2 - 1)^m \right] \Big]_{-1}^{1}
- \int_{-1}^{1} \left[ d_{n-1} (x^2 - 1)^n \right] \left[ d_{m+1} (x^2 - 1)^m \right] dx, \tag{5.11}$$

where we employed integration by parts. Since

$$d_{n-1} (x^2 - 1)^n = (x^2 - 1) \times (\text{a polynomial}),$$

the first term in the last line of (5.11) vanishes upon putting in the limits $\pm 1$,
leaving the second term alone. Therefore, after $n$ partial integrations, we have

$$\int_{-1}^{1} P_m(x) P_n(x) \, dx = (-1)^n \int_{-1}^{1} (x^2 - 1)^n \, d_{m+n} (x^2 - 1)^m \, dx.$$

Now, if $n > m$, then $n + m > 2m$ so that $d_{n+m} (x^2 - 1)^m = 0$. Therefore,

$$\int_{-1}^{1} P_n(x) P_m(x) \, dx = 0 \ \text{ for } m \ne n.$$

If $m = n$, then we have

$$\int_{-1}^{1} P_n(x)^2 \, dx = \frac{(-1)^n}{2^{2n} (n!)^2} \int_{-1}^{1} (x^2 - 1)^n \, d_{2n} (x^2 - 1)^n \, dx, \tag{5.12}$$

where a normalization constant is explicitly attached. Since $(x^2 - 1)^n$ is a
polynomial of degree $2n$, its $(2n)$th derivative is just $(2n)!$. Hence, the integral
(5.12) reads

$$\int_{-1}^{1} P_n(x)^2 \, dx = \frac{(2n)! \cdot (-1)^n}{2^{2n} (n!)^2} \int_{-1}^{1} (x^2 - 1)^n \, dx = \frac{2}{2n + 1}. \tag{5.13}$$


As summarized, the orthogonal property of Legendre polynomial functions
is given by

$$\int_{-1}^{1} P_m(x) P_n(x) \, dx = \begin{cases} 0 & (m \ne n), \\[1mm] \dfrac{2}{2n + 1} & (m = n). \end{cases}$$

Remark. Equation (5.13) follows from the identity

$$\int_{-1}^{1} (1 - x^2)^n \, dx = 2^{2n+1} \int_0^1 t^n (1 - t)^n \, dt = 2^{2n+1} B(n + 1, n + 1)
= 2^{2n+1} \frac{\Gamma(n + 1)^2}{\Gamma(2n + 2)} = 2^{2n+1} \frac{(n!)^2}{(2n + 1)!}.$$

Here, we have changed the variable by setting $x = 2t - 1$ to obtain the beta
function $B(x, y)$ and the gamma function $\Gamma(x)$ defined, respectively, by

$$B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1} \, dt = \frac{\Gamma(x) \Gamma(y)}{\Gamma(x + y)},$$

$$\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1} \, dt.$$
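Both the orthogonality relation and the identity used in the remark can be
confirmed with exact arithmetic, since every integrand is a polynomial. The
sketch below generates $P_n$ from the explicit sum (5.9), integrates products
term by term over $[-1, 1]$, and compares $\int_{-1}^{1} (1 - x^2)^n dx$ with
$2^{2n+1} (n!)^2 / (2n+1)!$. Polynomials are coefficient lists indexed by the
power of $x$, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial

def legendre(n):
    """Coefficients of P_n from the explicit sum (5.9); index = power of x."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        num = (-1) ** k * factorial(2 * n - 2 * k)
        den = 2 ** n * factorial(k) * factorial(n - k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate_poly(p):
    """Exact integral over [-1, 1]; odd powers contribute nothing."""
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(p) if k % 2 == 0)

def overlap(m, n):
    """Integral of P_m(x) P_n(x) over [-1, 1]."""
    return integrate_poly(poly_mul(legendre(m), legendre(n)))

def one_minus_x2_power(n):
    """Coefficients of (1 - x^2)^n."""
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, [Fraction(1), Fraction(0), Fraction(-1)])
    return p

def beta_side(n):
    """Right-hand side 2^(2n+1) (n!)^2 / (2n+1)! of the identity."""
    return Fraction(2 ** (2 * n + 1) * factorial(n) ** 2, factorial(2 * n + 1))
```

For example, `overlap(3, 3)` equals $2/7$, and `overlap(m, n)` vanishes whenever
`m` differs from `n`.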


5.1.4 Fourier Series 

We next consider the application of the Weierstrass theorem to functions
of two variables. Through an argument parallel to the earlier discussion, we
arrive at the completeness of the set of trigonometric functions $\sin n\theta$ and
$\cos n\theta$ ($n = 0, 1, \cdots, \infty$).

The Weierstrass theorem tells us that any function $g(x, y)$ that is continuous
in both variables on finite closed intervals may be approximated uniformly
by the sequence of functions

$$g_N(x, y) = \sum_{n,m=0}^{N} a_{nm} x^n y^m. \tag{5.14}$$

Employ polar coordinates and restrict the domain of definition to the unit
circle $x = \cos\theta$ and $y = \sin\theta$ to find

$$g_N(\cos\theta, \sin\theta) \equiv f_N(\theta) = \sum_{n,m=0}^{N} a_{nm} \cos^n\theta \sin^m\theta. \tag{5.15}$$

Clearly, $f_N(\theta)$ should be periodic with periodicity $2\pi$. Using Euler's
equation,

$$e^{i\theta} = \cos\theta + i \sin\theta,$$

we obtain expressions for the $n$th powers of $\sin\theta$ and $\cos\theta$:

$$\cos^n\theta = \left[ \frac{1}{2} \left( e^{i\theta} + e^{-i\theta} \right) \right]^n, \qquad
\sin^n\theta = \left[ \frac{1}{2i} \left( e^{i\theta} - e^{-i\theta} \right) \right]^n.$$

We then rewrite (5.15) in the form

$$f_M(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-M}^{M} c_n^{(M)} e^{inx} \ \text{ with } M = 2N, \tag{5.16}$$

where we have inserted the factor $(2\pi)^{-1/2}$ for later convenience and have
replaced the variable $\theta$ by $x$ to emphasize the generality of the result.

The superscript $M$ attached to $c_n^{(M)}$ in (5.16) suggests the possibility that
the values of $c_n^{(M)}$ are dependent on $M$. However, this is not the case. In fact,
the values of the coefficients $c_n$ are determined independently of $M$ owing to
the completeness of the orthonormal set of functions

$$F_n(x) = \frac{e^{inx}}{\sqrt{2\pi}}, \quad n = 0, \pm 1, \cdots,$$

defined on the interval $[-\pi, \pi]$. The completeness of the set $\{F_n\}$ allows us to
approximate an arbitrary function $f$ in the mean by an infinite series of the
$F_n$, and we write

$$f(x) = \sum_{n=-\infty}^{\infty} c_n F_n(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} c_n e^{inx}, \tag{5.17}$$

where the expansion coefficients are given by

$$c_n = (F_n, f) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx. \tag{5.18}$$

The series (5.17) with the coefficients (5.18) is known as the trigonometric
Fourier series. The completeness of the set $\{F_n\}$ can be verified in a
discussion similar to that in Sect. 5.1.2.
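The coefficients (5.18) and the truncated series (5.17) are easy to evaluate
numerically. The sketch below assumes the test function $f(x) = x^2$ on
$[-\pi, \pi]$, for which $c_0 = \sqrt{2\pi}\,\pi^2/3$ and
$c_n = 4\pi (-1)^n / (\sqrt{2\pi}\, n^2)$ for $n \ne 0$; the midpoint rule, the
cutoff $M = 20$, and the function names are choices made for this illustration.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4096):
    """c_n = (2 pi)^(-1/2) * integral_{-pi}^{pi} f(x) e^{-inx} dx (midpoint rule)."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / math.sqrt(2 * math.pi)

def partial_sum(coeffs, x):
    """Evaluate (2 pi)^(-1/2) * sum_n c_n e^{inx} for a dict {n: c_n}."""
    total = sum(c * cmath.exp(1j * n * x) for n, c in coeffs.items())
    return total / math.sqrt(2 * math.pi)

f = lambda x: x * x
M = 20
coeffs = {n: fourier_coefficient(f, n) for n in range(-M, M + 1)}
approx = partial_sum(coeffs, 1.0).real      # truncated series at x = 1
```

Each coefficient is computed once and for all; enlarging the cutoff `M` adds new
terms to the sum but leaves the earlier coefficients untouched, in line with the
discussion of (5.8).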


5.1.5 Spherical Harmonic Functions 

We have derived the sets of Legendre polynomials and trigonometric functions
from the Weierstrass approximation theorem in one and two variables,
respectively. We now derive the set of spherical harmonics from a three-variable
generalization. It tells us that a function $g$ of $x, y, z$ (i.e., $\mathbf{r}$) can be approximated
uniformly by a sequence of partial sums given by

$$ g_M(\mathbf{r}) = \sum_{j,k,n=0}^{M} a^{(M)}_{jkn} x^j y^k z^n. \tag{5.19} $$
We may also use an alternative coordinate system such as

$$ u = x + iy = r\sin\theta\, e^{i\phi}, \qquad v = x - iy = r\sin\theta\, e^{-i\phi}, \qquad w = z = r\cos\theta, $$

which yields

$$ g_M(\mathbf{r}) = \sum_{\alpha,\beta,\gamma=0}^{M} b^{(M)}_{\alpha\beta\gamma} u^{\alpha} v^{\beta} w^{\gamma} \tag{5.20} $$

$$ = \sum_{l=0}^{3M} r^{l} \sum_{(\alpha,\beta,\gamma)} b^{(M)}_{\alpha\beta\gamma} e^{i(\alpha-\beta)\phi} \sin^{\alpha+\beta}\theta \cos^{\gamma}\theta. \tag{5.21} $$

In (5.21), the symbol $\sum_{(\alpha,\beta,\gamma)}$ indicates taking the sums over combinations
of $\alpha, \beta, \gamma$ subject to the condition $\alpha + \beta + \gamma = l$. [Note that the sum over all
$l$ in effect removes the restriction on $\alpha, \beta, \gamma$ and gives the same results as the
original unrestricted sum in (5.20).]

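We now restrict $\mathbf{r}$ to the unit sphere by requiring that $|\mathbf{r}| = 1$, and
introduce an index $m = \alpha - \beta$. The expression (5.21) is then rewritten in
the form

$$ g_M(\theta, \phi) = \sum_{l=0}^{3M} \sum_{(\alpha,\beta,\gamma)} b^{(M)}_{\alpha\beta\gamma} e^{im\phi} \sin^{\alpha+\beta-|m|}\theta \cos^{\gamma}\theta \sin^{|m|}\theta. $$

A trigonometric identity gives

$$ \sin^{\alpha+\beta-|m|}\theta \cos^{\gamma}\theta = (1 - \cos^2\theta)^{(\alpha+\beta-|m|)/2} \cos^{\gamma}\theta, $$

which is a polynomial in $\cos\theta$ of maximum degree $\alpha + \beta + \gamma - |m| = l - |m|$,
since $\alpha + \beta - |m|$ is even (see the remark below). Denoting this polynomial
by $f_{lm}(\cos\theta)$, we get

$$ g_M(\theta, \phi) = \sum_{l=0}^{3M} \sum_{m} b_{lm} e^{im\phi} \sin^{|m|}\theta\, f_{lm}(\cos\theta). \tag{5.22} $$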

Remark. That $\alpha + \beta - |m|$ is even is seen by observing the identity

$$ \alpha + \beta - |m| = m - |m| + 2\beta. $$

On the right-hand side, $2\beta$ is even and

$$ m - |m| = \begin{cases} 0 & \text{if } m \ge 0, \\ -2|m| & \text{if } m < 0. \end{cases} $$
The range of the summation over $m$ still has to be specified. Recall that all the
$\alpha, \beta, \gamma$ are nonnegative integers subject to the condition that $\alpha + \beta + \gamma = l \ge 0$.
This is illustrated schematically in Fig. 5.2, in which the point $(\alpha, \beta, \gamma)$ must
lie on the oblique face of the tetrahedron depicted in the $\alpha\beta\gamma$ space. The line
$\alpha - \beta = m$ on the $\gamma$ plane is shown as a solid line. In order for it to intersect
the oblique face, $m$ must satisfy the condition that

$$ -l \le m \le l. $$

Therefore, the sum over $m$ in (5.22) is restricted to $|m| \le l$, and the last
equation becomes

$$ g_M(\theta, \phi) = \sum_{l=0}^{3M} \sum_{m=-l}^{l} b_{lm} Y_{lm}(\theta, \phi). \tag{5.23} $$

Here the sequence of functions

$$ Y_{lm}(\theta, \phi) = e^{im\phi} \sin^{|m|}\theta\, f_{lm}(\cos\theta), \tag{5.24} $$

where $f_{lm}(\cos\theta)$ is a polynomial in $\cos\theta$ of degree $l - |m|$, provides a uniform
approximation to any continuous function defined on the unit sphere. The
functions $Y_{lm}$ are called spherical harmonics. Note that for a given $l$, there
are $2l + 1$ functions $Y_{lm}$.

The orthonormality of the set $\{Y_{lm}\}$ is characterized by the relation

$$ \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, d\theta\; Y^{*}_{l'm'}(\theta, \phi)\, Y_{lm}(\theta, \phi) = \delta_{ll'}\delta_{mm'}, $$

which determines the functions $Y_{lm}$ uniquely up to a phase factor.

Fig. 5.2. The solid and dashed-dotted lines shown on the $\gamma$-plane indicate the
relation $\alpha - \beta = m$ for $-l < m < l$ and $m = \pm l$, respectively. In order for the point
$(\alpha, \beta, \gamma)$ to be on the oblique face of the tetrahedron, the condition $-l \le m \le l$ should
be satisfied so that the solid line intersects the line segment AB on the $\gamma$-plane.
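General equations for the $Y_{lm}$ are

$$ Y_{lm}(\theta, \phi) = \begin{cases} (-1)^m \left[ \dfrac{2l+1}{4\pi} \dfrac{(l-m)!}{(l+m)!} \right]^{1/2} P_l^m(\cos\theta)\, e^{im\phi}, & m \ge 0, \\[2ex] (-1)^m\, Y^{*}_{l,-m}(\theta, \phi), & m < 0, \end{cases} $$

where

$$ P_l^m(x) = (1 - x^2)^{m/2} \frac{d^m}{dx^m} P_l(x) = \frac{1}{2^l l!} (1 - x^2)^{m/2} \frac{d^{l+m}}{dx^{l+m}} (x^2 - 1)^l, \qquad m \ge 0, \tag{5.25} $$

are called the associated Legendre functions.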


Remark. 

1. The normalization constant of the $Y_{lm}$ follows immediately from the
orthonormality relations for the associated Legendre functions:

$$ \int_{-1}^{1} P_l^m(x) P_{l'}^m(x)\, dx = \frac{(l+m)!}{(l-m)!} \frac{2}{2l+1} \delta_{ll'}. $$

There is, of course, a free choice of phase factor; ours is a common choice
in the physics literature. However, one must be careful because different
authors choose different phase factors for the spherical harmonics.

2. We should note that the associated Legendre functions $P_l^m(x)$ are not
another orthonormal set of polynomials on $[-1, 1]$. In fact, they are not
polynomials at all, as is clearly seen in equation (5.25).
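
The orthonormality relation for the $Y_{lm}$ can be checked numerically for the lowest orders. The sketch below is only an illustration: it hard-codes the standard explicit forms of $Y_{00}$, $Y_{10}$, and $Y_{1,\pm 1}$ (assumed here with the phase convention described above), integrates over $\cos\theta$ with Gauss-Legendre quadrature, and integrates over $\phi$ with a uniform sum. The grid sizes and helper names are choices made for this example.

```python
import numpy as np

def Y(l, m, theta, phi):
    """Explicit spherical harmonics for l <= 1 (standard forms, assumed here)."""
    table = {
        (0, 0): lambda: np.sqrt(1 / (4 * np.pi)) * np.ones_like(theta),
        (1, 0): lambda: np.sqrt(3 / (4 * np.pi)) * np.cos(theta),
        (1, 1): lambda: -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi),
        (1, -1): lambda: np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi),
    }
    return table[(l, m)]() * (1 + 0j)

# Quadrature: Gauss-Legendre nodes in u = cos(theta), uniform points in phi.
u, wu = np.polynomial.legendre.leggauss(16)
phi_pts = 2 * np.pi * np.arange(32) / 32
TH, PH = np.meshgrid(np.arccos(u), phi_pts, indexing="ij")
W = wu[:, None] * (2 * np.pi / 32)      # sin(theta) d(theta) d(phi) = du d(phi)

def inner(a, b):
    """Integral of conj(Y_a) * Y_b over the unit sphere."""
    return np.sum(W * np.conj(Y(*a, TH, PH)) * Y(*b, TH, PH))

labels = [(0, 0), (1, -1), (1, 0), (1, 1)]
gram = np.array([[inner(a, b) for b in labels] for a in labels])
```

The matrix `gram` should be the 4 x 4 identity up to rounding, which is the statement $\delta_{ll'}\delta_{mm'}$ restricted to $l \le 1$.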


Exercises 

1. Find the normalized Legendre polynomials $\tilde{P}_n(x)$.

Solution: Using equation (5.13), we write the normalized Legendre
polynomials $\tilde{P}_n(x)$ as

$$ \tilde{P}_n(x) = \sqrt{\frac{2n+1}{2}}\, P_n(x), \qquad n = 0, 1, 2, \ldots. $$

2. Derive the explicit form of each function: $Y_{00}$, $Y_{11}$, $Y_{10}$, and $Y_{1,-1}$.

Solution: It follows from (5.24) that for $l = m = 0$, we obtain
$Y_{00} = \sqrt{1/(4\pi)}$. If $l = 1$, then $m$ can equal $-1$, $0$, or $+1$. Recalling
that $f_{lm}(\cos\theta)$ is a polynomial in $\cos\theta$ of degree $l - |m|$, we obtain
$Y_{10} = c_1 \cos\theta + c_2$, $Y_{11} = c_3 e^{i\phi} \sin\theta$, $Y_{1,-1} = c_4 e^{-i\phi} \sin\theta$. The
constants $c_1, c_2, c_3, c_4$ are determined by imposing orthonormality.
For instance,

$$ \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, Y^{*}_{00} Y_{10}\, d\theta = \sqrt{\pi} \int_0^{\pi} d\theta\, \sin\theta\, (c_1 \cos\theta + c_2) = \delta_{01}\delta_{00} = 0, $$

$$ \int_0^{2\pi} d\phi \int_0^{\pi} \sin\theta\, |Y_{10}|^2\, d\theta = 2\pi \int_0^{\pi} d\theta\, \sin\theta\, (c_1 \cos\theta + c_2)^2 = \delta_{11}\delta_{00} = 1, $$

which result in $c_1 = \sqrt{3/(4\pi)}$ and $c_2 = 0$. Similarly, it follows
that $c_3 = -c_4 = -\sqrt{3/(8\pi)}$. We choose the minus sign with the
convention to be adopted later. Therefore, the first few members
of the set $\{Y_{lm}\}$ are

$$ Y_{00} = \frac{1}{\sqrt{4\pi}}, \qquad Y_{10} = \sqrt{\frac{3}{4\pi}} \cos\theta, \qquad Y_{1,\pm 1} = \mp\sqrt{\frac{3}{8\pi}}\, e^{\pm i\phi} \sin\theta. $$


3. From the generating function of Legendre polynomials determine that

(i) $P_n(1) = 1$, $P_n(-1) = (-1)^n$,

(ii) $P_{2n}(0) = (-1)^n \dfrac{(2n-1)!!}{(2n)!!}$, $P_{2n+1}(0) = 0$ with $(-1)!! = 1$,

(iii) $\displaystyle \int_{-1}^{1} x^n P_n(x)\, dx = \frac{2^{n+1} (n!)^2}{(2n+1)!}$.

Solution: We use the equation

$$ (1 - 2tx + t^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(x) t^n. $$

(i) For $x = 1$, we have

$$ \frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n = \sum_{n=0}^{\infty} P_n(1) t^n, $$

which yields $P_n(1) = 1$. Similarly for $x = -1$, we obtain

$$ \frac{1}{1 + t} = \sum_{n=0}^{\infty} (-1)^n t^n = \sum_{n=0}^{\infty} P_n(-1) t^n, $$

which gives $P_n(-1) = (-1)^n$.

(ii) For $x = 0$, we have

$$ \frac{1}{(1 + t^2)^{1/2}} = \sum_{n=0}^{\infty} (-1)^n \frac{(2n)!}{(2^n n!)^2}\, t^{2n} = \sum_{n=0}^{\infty} P_n(0) t^n. $$

Then, since $(2n)!/(2^n n!)^2 = (2n-1)!!/(2n)!!$, we have the desired result.

(iii) We get the relation by performing an integration by parts $n$
times.


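4. Show that the Coulomb potential at $\mathbf{r} = \mathbf{r}_0$ experienced from the unit
charge at $z = a$ on the z-axis is given by

$$ V(\mathbf{r}_0) = \frac{1}{4\pi\varepsilon_0 a} \sum_{n=0}^{\infty} \left( \frac{r_0}{a} \right)^n P_n(\cos\theta), $$

where $\theta$ is the angle between the z-axis and the vector $\mathbf{r}_0$, and $a$ satisfies the
condition $r_0 < a$.

Solution: Using the generating function of Legendre polynomials,
we have

$$ V(\mathbf{r}_0) = \frac{1}{4\pi\varepsilon_0} \frac{1}{|\mathbf{r}_0 - \mathbf{a}|} = \frac{1}{4\pi\varepsilon_0} \frac{1}{\sqrt{r_0^2 + a^2 - 2 a r_0 \cos\theta}} = \frac{1}{4\pi\varepsilon_0 a} \sum_{n=0}^{\infty} \left( \frac{r_0}{a} \right)^n P_n(\cos\theta). $$

This series converges because $r_0 < a$ and $|P_n(\cos\theta)| \le 1$.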
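
The expansion in Exercise 4 is easy to test numerically. The following sketch compares the closed form $1/|\mathbf{r}_0 - \mathbf{a}|$ with a partial sum of the Legendre series; the prefactor $1/(4\pi\varepsilon_0)$ is dropped, and the sample values of $r_0$, $a$, and $\theta$ are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def inverse_distance_series(r0, a, theta, n_terms):
    """Partial sum of (1/a) * sum_n (r0/a)^n P_n(cos theta)."""
    coeffs = (r0 / a) ** np.arange(n_terms) / a
    # legval evaluates sum_n coeffs[n] * P_n(x).
    return legendre.legval(np.cos(theta), coeffs)

r0, a, theta = 0.5, 2.0, 0.7      # hypothetical values with r0 < a
exact = 1.0 / np.sqrt(r0**2 + a**2 - 2 * a * r0 * np.cos(theta))
approx = inverse_distance_series(r0, a, theta, n_terms=30)
```

Because the terms are bounded by $(r_0/a)^n$, the partial sums converge geometrically; the closer $r_0$ is to $a$, the more terms are needed.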


5.2 Classification of Orthonormal Functions 

5.2.1 General Rodrigues Formula 

In the previous section we saw that several kinds of orthonormal polynomials
can be produced through the Gram-Schmidt orthogonalization process by
starting with $1, x, x^2, \ldots$. However, there is a more elegant approach that
applies to most polynomials of interest to physicists. This section describes this
approach, which is based on the Rodrigues formula and classifies various
orthogonal polynomials in terms of the parameters involved in the formula.
General Rodrigues formula:

$$ Q_n(x) = \frac{1}{K_n w(x)} \frac{d^n}{dx^n} \left[ w(x) s^n(x) \right] \qquad (n = 0, 1, 2, \ldots), \tag{5.26} $$

where it is assumed that

1. $Q_1(x)$ is a first-degree polynomial in $x$.

2. $s(x)$ is a polynomial in $x$ of degree no more than 2 with real roots.

3. $w(x)$ is real, positive, and integrable in the interval $[a, b]$ and satisfies
the boundary condition

$$ w(a)s(a) = w(b)s(b) = 0. $$


Equation (5.26) under the three conditions noted above provides the sequence
of functions $(Q_0(x), Q_1(x), Q_2(x), \ldots)$ that forms an orthogonal set of
polynomials on the interval $[a, b]$ with a weight function $w(x)$, which can be
normalized by a suitable choice of constants $K_n$. For historical reasons, different
polynomial functions are normalized differently, which is why $K_n$ is introduced
here. For the time being, we omit the factor $K_n$ without loss of generality.


Theorem:

The function $Q_n(x)$ defined by (5.26) is a polynomial in $x$ of the nth
degree and satisfies the orthogonality relation on the interval $[a, b]$ with
weight $w(x)$:

$$ \int_a^b p_m(x) Q_n(x) w(x)\, dx = 0 \qquad (m < n), \tag{5.27} $$

where $p_m(x)$ is an arbitrary polynomial of degree $m < n$.


Proof. From hypothesis, we have

$$ \left. \frac{d^m}{dx^m} \left[ w(x) s^n(x) \right] \right|_{x = a \text{ or } b} = 0 \qquad (\text{if } m < n) \tag{5.28} $$

and

$$ \frac{d^m}{dx^m} \left[ w(x) s^n(x) p_{(\le k)}(x) \right] = w(x) s^{n-m}(x) p_{(\le k+m)}(x), \tag{5.29} $$

where the symbol $p_{(\le k)}(x)$ denotes an arbitrary polynomial in $x$ of degree
$\le k$. Then, integrating (5.27) by parts $n$ times, we obtain for $m < n$,

$$ \int_a^b p_m(x) Q_n(x) w(x)\, dx = \int_a^b p_m(x) \frac{d^n}{dx^n} \left[ w(x) s^n(x) \right] dx = (-1)^n \int_a^b w(x) s^n(x) \frac{d^n}{dx^n} p_m(x)\, dx = 0, \tag{5.30} $$

where we used (5.26) and (5.28). Next we examine whether or not $Q_n$ is a
polynomial of degree $n$. Set $n = m$ and $k = 0$ in (5.29) to obtain

$$ \frac{1}{w(x)} \frac{d^n}{dx^n} \left[ w(x) s^n(x) \right] = Q_n(x) = p_{(\le n)}(x), $$

which indicates that $Q_n(x)$ is a polynomial of degree no more than $n$. We thus
tentatively write

$$ Q_n(x) = p_{(\le n-1)}(x) + a_n x^n, \tag{5.31} $$

and would like to show that $a_n \ne 0$. Multiplying both sides of (5.31) by
$Q_n(x) w(x)$ followed by integrating on $[a, b]$ yields

$$ \int_a^b \left[ Q_n(x) \right]^2 w(x)\, dx = \int_a^b p_{(\le n-1)}(x) Q_n(x) w(x)\, dx + a_n \int_a^b x^n Q_n(x) w(x)\, dx = a_n \int_a^b x^n Q_n(x) w(x)\, dx, $$

where we used (5.30). Since the left-hand side is positive, this proves that $a_n \ne 0$, i.e., that $Q_n(x)$ is a
polynomial of the nth degree.
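
The theorem can be illustrated with a concrete case. The sketch below assumes the Legendre choice $w(x) = 1$, $s(x) = 1 - x^2$ on $[-1, 1]$ with $K_n = 1$ (an assumption for this example; the text has not yet specialized the formula) and checks that the polynomials produced by (5.26) have degree $n$ and are mutually orthogonal.

```python
import numpy as np
from numpy.polynomial import Polynomial

def rodrigues_legendre(n):
    """Q_n(x) = d^n/dx^n (1 - x^2)^n, i.e. (5.26) with w = 1, s = 1 - x^2, K_n = 1."""
    p = Polynomial([1.0, 0.0, -1.0]) ** n
    return p.deriv(n) if n > 0 else p

def inner(p, q):
    """Integral of p(x) q(x) over [-1, 1] with weight w(x) = 1."""
    antiderivative = (p * q).integ()
    return antiderivative(1.0) - antiderivative(-1.0)

Q = [rodrigues_legendre(n) for n in range(6)]
gram = np.array([[inner(Q[m], Q[n]) for n in range(6)] for m in range(6)])
degrees = [q.degree() for q in Q]

# Rescale so that the diagonal is 1; off-diagonal entries should then vanish.
scale = np.sqrt(np.diag(gram))
normalized_gram = gram / np.outer(scale, scale)
```

With this choice each $Q_n$ is proportional to the Legendre polynomial $P_n$; the proportionality constant is exactly what a nontrivial $K_n$ would absorb.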


5.2.2 Classification of the Polynomials 

In what follows, we classify the orthogonal polynomials that are derived from
the Rodrigues formula (5.26) according to the three conditions noted earlier.
By condition 1 associated with (5.26), $Q_1(x)$ is a first-degree polynomial,
and we can define it as

$$ Q_1(x) = -\frac{x}{K_1}. \tag{5.32} $$

Then the Rodrigues formula (5.26) with $n = 1$ reads

$$ \frac{1}{w} \frac{dw}{dx} = -\frac{x + (ds/dx)}{s}. \tag{5.33} $$

Recall that $s(x)$ can be a zeroth-, first-, or second-degree polynomial. In
each case, we can find an appropriate weight function $w(x)$ that satisfies the
differential equation (5.33) as well as the boundary condition 3:

$$ w(a)s(a) = w(b)s(b) = 0. \tag{5.34} $$
Such discussions determine the explicit forms of possible functions $s(x)$ and
$w(x)$ under conditions 1-3 in Sect. 5.2.1 and then allow classification of all
of the orthogonal polynomials provided by the general Rodrigues formula
described below.


Hermite polynomials: 


We first consider the case that $s(x)$ is a zeroth-degree polynomial, i.e., a
constant given by

$$ s(x) = \alpha. $$

Equation (5.33) takes the form

$$ \frac{1}{w} \frac{dw}{dx} = -\frac{x}{\alpha} $$

and has the solution

$$ w(x) = A \exp\left( -\frac{x^2}{2\alpha} \right) \tag{5.35} $$

with a constant $A$.

The product $w(x)s(x)$ vanishes only at $x = \pm\infty$, provided that $\alpha > 0$. To
satisfy the conditions in (5.34), we have to set

$$ a = -\infty, \qquad b = +\infty. $$

The constants $A$ and $\alpha$ affect only the multiplicative factor in front of each
polynomial. Thus, without loss of generality, we can take $\alpha = 1$ and $A = 1$,
which yields

$$ w(x) = e^{-x^2/2}. $$

The complete orthonormal polynomials corresponding to this case are known
as Hermite polynomials, designated by $H_n(x)$, and satisfy the orthonormal
condition

$$ \int_{-\infty}^{\infty} e^{-x^2/2} H_m(x) H_n(x)\, dx = \delta_{mn}. $$


Laguerre polynomials: 

Next we let $s(x)$ be a polynomial of the first degree, such as

$$ s(x) = \beta(x - \alpha). $$

The Rodrigues formula (5.26) now becomes

$$ \frac{1}{w} \frac{dw}{dx} = -\frac{x + \beta}{\beta(x - \alpha)}, $$

which has the solution

$$ w(x) = \text{const.} \times (x - \alpha)^{\nu} e^{-x/\beta}, $$

where

$$ \nu = -\frac{\alpha + \beta}{\beta}. $$

If $\beta > 0$ and $\nu > -1$, then $s(x)w(x)$ vanishes at $x = \alpha$ and $x = +\infty$, and
$w(x)$ is integrable in the interval $[\alpha, +\infty)$. The simplest choice is therefore to
take $\alpha = 0$ and $\beta = 1$, which yields

$$ w = x^{\nu} e^{-x}, \qquad a = 0, \quad b = +\infty. $$

These choices result in the Laguerre polynomials, commonly denoted by
$L_n^{\nu}(x)$, whose orthonormality relation is given by

$$ \int_0^{\infty} x^{\nu} e^{-x} L_m^{\nu}(x) L_n^{\nu}(x)\, dx = \delta_{mn} \quad \text{with } \nu > -1. $$

Jacobi polynomials: 


Finally, let us take

$$ s(x) = \gamma(x - \alpha)(\beta - x), \qquad \beta > \alpha. $$

Here we assume that $s(x)$ has two distinct roots. [If $s(x)$ has a double root, the
boundary condition (5.34) cannot be satisfied, since in this case the function
$s(x)w(x)$ cannot vanish at more than one point.] The Rodrigues formula (5.26)
now reads

$$ \frac{1}{w} \frac{dw}{dx} = -\frac{x + \gamma(\beta - x) - \gamma(x - \alpha)}{\gamma(x - \alpha)(\beta - x)}, $$

which has the solution

$$ w(x) = \text{const.} \times (x - \alpha)^{\mu} (\beta - x)^{\nu}, $$

with

$$ \mu = -1 - \frac{\alpha}{\gamma(\beta - \alpha)} \quad \text{and} \quad \nu = -1 + \frac{\beta}{\gamma(\beta - \alpha)}. $$

If $\mu > -1$ and $\nu > -1$, then $s(x)w(x)$ vanishes at $x = \alpha$ and $x = \beta$, and $w(x)$
is integrable on the interval $[\alpha, \beta]$. With the replacement

$$ \frac{2x - \alpha - \beta}{\beta - \alpha} \to x, $$

apart from multiplicative factors, we obtain

$$ w = (1 - x)^{\nu} (1 + x)^{\mu} \quad \text{with } \nu, \mu > -1, \qquad a = -1, \quad b = +1. $$

The corresponding complete orthonormal polynomials are called the Jacobi
polynomials $G_n^{\mu,\nu}(x)$, and satisfy the relation

$$ \int_{-1}^{1} (1 - x)^{\nu} (1 + x)^{\mu} G_m^{\mu,\nu}(x) G_n^{\mu,\nu}(x)\, dx = \delta_{mn} \quad \text{with } \nu, \mu > -1. $$

Remark. Jacobi polynomials can be divided into subcategories depending on
the values of $\mu$ and $\nu$. The most common and widely used in mathematical
physics are collected in Table 5.1.

Table 5.1. Special cases of Jacobi polynomials

  μ          ν          w(x)                    Polynomial
  λ - 1/2    λ - 1/2    (1 - x^2)^(λ - 1/2)     Gegenbauer, C_n^λ(x), λ > -1/2
  0          0          1                       Legendre, P_n(x)
  -1/2       -1/2       (1 - x^2)^(-1/2)        Chebyshev of the first kind, T_n(x)
  1/2        1/2        (1 - x^2)^(1/2)         Chebyshev of the second kind, U_n(x)


5.2.3 The Recurrence Formula 

We now show that all the orthogonal polynomials derived from the Rodrigues
formula (5.26) satisfy the following relation:

Recurrence formula:

$$ Q_{n+1}(x) = (a_n x + b_n) Q_n(x) - c_n Q_{n-1}(x) \qquad (n = 1, 2, \ldots), \tag{5.36} $$

where the constants $a_n$, $b_n$, and $c_n$ depend on the class of polynomials
considered.

Proof. The only property needed for the proof of (5.36) is the orthogonality
relation:

$$ \int_a^b Q_n(x) p_{(<n)}(x) w(x)\, dx = 0, \tag{5.37} $$

where the symbol $p_{(<n)}(x)$ denotes an arbitrary polynomial in $x$ of degree less
than $n$. For convenience, we introduce the following notation:

$$ \xi_n = \text{coefficient of } x^n \text{ in } Q_n(x), \qquad \eta_n = \text{coefficient of } x^{n-1} \text{ in } Q_n(x), \tag{5.38} $$

$$ I_n = \int_a^b Q_n^2(x) w(x)\, dx. \tag{5.39} $$

It then follows that

$$ Q_{n+1}(x) - \frac{\xi_{n+1}}{\xi_n} x Q_n(x) = \sum_{i=0}^{n} r_i^{(n)} Q_i(x) $$

because the left-hand side is a polynomial of degree $\le n$; the $r_i^{(n)}$ are appropriate
constants determined by the left-hand side. Multiplying both sides by $w Q_m$,
taking $m$ equal to $0, 1, 2, \ldots, n - 2$ successively, and using the orthogonality
relation (5.37), we obtain

$$ r_m^{(n)} = 0 \quad \text{for } m = 0, 1, 2, \ldots, n - 2. $$

Thus

$$ Q_{n+1}(x) - \frac{\xi_{n+1}}{\xi_n} x Q_n(x) = r_n^{(n)} Q_n(x) + r_{n-1}^{(n)} Q_{n-1}(x), \tag{5.40} $$

which is the recurrence formula we are looking for.


5.2.4 Coefficients of the Recurrence Formula 

We now have to find the constants $r_n^{(n)}$ and $r_{n-1}^{(n)}$ in (5.40). In view of the
orthogonality relation (5.37), we have

$$ I_n = \int_a^b Q_n^2(x) w(x)\, dx = \xi_n \int_a^b Q_n(x) x^n w(x)\, dx. \tag{5.41} $$

Multiplying (5.40) by $w Q_{n-1}$ and integrating, we obtain

$$ r_{n-1}^{(n)} I_{n-1} = -\frac{\xi_{n+1}}{\xi_n} \int_a^b Q_n(x) Q_{n-1}(x)\, x\, w(x)\, dx = -\frac{\xi_{n+1}}{\xi_n} \frac{\xi_{n-1}}{\xi_n} \int_a^b Q_n(x)\, \xi_n x^n w(x)\, dx = -\frac{\xi_{n+1}\xi_{n-1}}{\xi_n^2} I_n. $$

Therefore,

$$ r_{n-1}^{(n)} = -\frac{I_n}{I_{n-1}} \frac{\xi_{n+1}\xi_{n-1}}{\xi_n^2}. \tag{5.42} $$

Substituting this into (5.40) and comparing the coefficients of $x^n$ on both
sides yields

$$ r_n^{(n)} = \frac{\eta_{n+1}}{\xi_n} - \frac{\xi_{n+1}}{\xi_n} \frac{\eta_n}{\xi_n}. \tag{5.43} $$

Finally, it follows from (5.40)-(5.43) that the coefficients $a_n$, $b_n$, and $c_n$ defined
in (5.36) become

$$ a_n = \frac{\xi_{n+1}}{\xi_n}, \qquad b_n = \frac{\xi_{n+1}}{\xi_n} \left( \frac{\eta_{n+1}}{\xi_{n+1}} - \frac{\eta_n}{\xi_n} \right), \qquad c_n = \frac{I_n}{I_{n-1}} \frac{\xi_{n+1}\xi_{n-1}}{\xi_n^2}. \tag{5.44} $$


The constants $\xi_n$ and $\eta_n$ can, in principle, be found from the Rodrigues
formula once the functions $s(x)$ and $w(x)$ as well as the constants $K_n$ have
been fixed. The constants $I_n$, which determine the normalization of the
polynomials, are given by

$$ I_n = \frac{(-1)^n \xi_n\, n!}{K_n} \int_a^b s^n(x) w(x)\, dx. $$

This follows immediately from the Rodrigues formula if we integrate $n$ times
by parts the integral

$$ I_n = \int_a^b Q_n^2(x) w(x)\, dx = \xi_n \int_a^b Q_n(x) x^n w(x)\, dx = \frac{\xi_n}{K_n} \int_a^b x^n \frac{d^n}{dx^n} \left[ s^n(x) w(x) \right] dx. $$

Although the explicit form of the coefficients given in (5.44) seems rather
complicated, the corresponding recurrence relation for a specific orthogonal
polynomial simplifies considerably.
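
To see how (5.44) simplifies in a specific case, the sketch below evaluates $a_n$, $b_n$, and $c_n$ for the Legendre polynomials in their usual normalization $P_n(1) = 1$. It relies on three standard facts that are assumed here rather than derived: the leading coefficient is $\xi_n = (2n)!/[2^n (n!)^2]$, the coefficient $\eta_n$ vanishes by parity, and $I_n = 2/(2n+1)$.

```python
from math import comb

def xi(n):
    """Leading coefficient of P_n: (2n)! / (2^n (n!)^2)."""
    return comb(2 * n, n) / 2**n

def norm_sq(n):
    """I_n = integral of P_n^2 over [-1, 1] = 2 / (2n + 1)."""
    return 2.0 / (2 * n + 1)

def recurrence_coefficients(n):
    """a_n, b_n, c_n from (5.44); eta_n = 0 because P_n has no x^(n-1) term."""
    a_n = xi(n + 1) / xi(n)
    b_n = 0.0
    c_n = norm_sq(n) / norm_sq(n - 1) * xi(n + 1) * xi(n - 1) / xi(n) ** 2
    return a_n, b_n, c_n

# Compare with the familiar form (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}.
table = [(n, *recurrence_coefficients(n)) for n in range(1, 6)]
```

The values reduce to $a_n = (2n+1)/(n+1)$, $b_n = 0$, and $c_n = n/(n+1)$, which is the Legendre recurrence divided through by $n + 1$.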


5.2.5 Roots of Orthogonal Polynomials 


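Consider the recurrence formula (5.36) in which the polynomials $Q_n(x)$ are
normalized, so that from (5.39) $I_n = 1$ $(n = 0, 1, 2, \ldots)$. After some
rearrangement, the equation takes the form

$$ x Q_{n-1}(x) = \frac{\xi_{n-1}}{\xi_n} Q_n(x) + \frac{\xi_{n-2}}{\xi_{n-1}} Q_{n-2}(x) + \beta_{n-1} Q_{n-1}(x), $$

where

$$ \beta_{n-1} = \frac{\eta_{n-1}}{\xi_{n-1}} - \frac{\eta_n}{\xi_n}. $$

The matrix form is given by

$$ x \begin{pmatrix} Q_0 \\ Q_1 \\ Q_2 \\ \vdots \\ Q_{N-1} \end{pmatrix} = \begin{pmatrix} \beta_0 & \xi_0/\xi_1 & 0 & \cdots & 0 \\ \xi_0/\xi_1 & \beta_1 & \xi_1/\xi_2 & \cdots & 0 \\ 0 & \xi_1/\xi_2 & \beta_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \beta_{N-1} \end{pmatrix} \begin{pmatrix} Q_0 \\ Q_1 \\ Q_2 \\ \vdots \\ Q_{N-1} \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ (\xi_{N-1}/\xi_N) Q_N \end{pmatrix}, $$

which gives the eigenvalue equations provided that $\{x_i\}$ are the roots of the
polynomial equation $Q_N(x) = 0$ such that

$$ J R(x_i) = x_i R(x_i), $$

where $J$ is the $N \times N$ tridiagonal matrix above and the column vector $R(x_i)$ is defined by

$$ R(x_i) = \left[ Q_0(x_i), Q_1(x_i), \ldots, Q_{N-1}(x_i) \right]^T. $$

Thus, the eigenvalues of the $N \times N$ matrix $J$ are the zeros of $Q_N(x)$. The
matrix is called the Jacobi matrix associated with the sequence $\{Q_n(x)\}$.
Since $J$ is real and symmetric, the eigenvalues $\{x_i\}$ are real. We thus have proved the
following theorem:

Theorem:

The eigenvalues $\{x_i\}$ $(i = 1, 2, \ldots, N)$ of the matrix $J$ are the
zeros of $Q_N(x)$. The eigenvector belonging to $x_i$ is
$R(x_i) = \left[ Q_0(x_i), Q_1(x_i), \ldots, Q_{N-1}(x_i) \right]^T$.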
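
The theorem suggests a practical way to compute zeros of orthogonal polynomials. As a minimal sketch, assume the normalized Legendre polynomials, for which $\beta_k = 0$ and $\xi_k/\xi_{k+1} = (k+1)/\sqrt{(2k+1)(2k+3)}$ (standard values, stated here as assumptions). The eigenvalues of $J$ are then compared with the Gauss-Legendre nodes supplied by NumPy, which are the zeros of $P_N$.

```python
import numpy as np

def legendre_jacobi_matrix(N):
    """N x N Jacobi matrix J for the normalized Legendre polynomials."""
    k = np.arange(N - 1)
    # Off-diagonal entries xi_k / xi_{k+1}; the diagonal beta_k is zero.
    off = (k + 1) / np.sqrt((2 * k + 1) * (2 * k + 3))
    return np.diag(off, 1) + np.diag(off, -1)

N = 6
J = legendre_jacobi_matrix(N)
eigenvalues = np.sort(np.linalg.eigvalsh(J))       # J is real symmetric
nodes, _ = np.polynomial.legendre.leggauss(N)      # zeros of P_N
```

Symmetric tridiagonal eigenvalue problems are cheap and numerically stable, which is why this route is often preferred over applying a root finder to the polynomial itself.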


5.2.6 Differential Equations Satisfied by the Polynomials 

Historically, most orthogonal polynomials were discovered as solutions of 
differential equations. Here we give a single generic differential equation that 
is satisfied by all the polynomials Q n . 
Theorem:

All of the orthogonal polynomials $Q_n(x)$ derived from the general
Rodrigues formula (5.26) satisfy the differential equation

$$ \frac{d}{dx} \left( s w \frac{dQ_n}{dx} \right) = -\lambda_n w Q_n, $$

with the constant

$$ \lambda_n = -n \left( K_1 \frac{dQ_1}{dx} + \frac{n-1}{2} \frac{d^2 s}{dx^2} \right). $$

Proof. Since $dQ_n(x)/dx$ is a polynomial of degree $\le (n - 1)$, it follows from
(5.29) that the function

$$ \frac{1}{w} \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] $$

is a polynomial of degree $\le n$. Thus, we can write

$$ \frac{1}{w} \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] = -\sum_{i=0}^{n} \lambda_n^{(i)} Q_i(x), \tag{5.45} $$

where the $\lambda_n^{(i)}$ are undetermined constants. Multiplying both sides of (5.45)
by $w Q_m$ and integrating, we get

$$ \int_a^b Q_m(x) \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] dx = -\lambda_n^{(m)} I_m. \tag{5.46} $$

Here $I_m$ is an integral given by (5.39). Integrating by parts, for $m < n$ the
left-hand side of (5.46) yields

$$ \int_a^b Q_m(x) \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] dx = -\int_a^b s(x) w(x) \frac{dQ_n}{dx} \frac{dQ_m}{dx}\, dx = \int_a^b w(x) Q_n(x) \left\{ \frac{1}{w} \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_m}{dx} \right] \right\} dx = 0. $$

We have used the condition that $s(a)w(a) = s(b)w(b) = 0$, which is
assumption 3 in Sect. 5.2.1. We also used the fact that $Q_n(x)$ is orthogonal to any
polynomial of degree $< n$. Consequently, we arrive at the result

$$ \lambda_n^{(m)} = 0 \quad \text{for } m < n. $$

Setting

$$ \lambda_n^{(n)} \equiv \lambda_n $$

for simplicity, we can rewrite (5.45) in the form

$$ \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] = -w(x) \lambda_n Q_n(x), \tag{5.47} $$

which is the differential equation satisfied by a polynomial $Q_n(x)$. The
constant $\lambda_n$ can be found by setting $m = n$ in (5.46) and integrating, as we
demonstrate later in Exercise 4.
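
As a quick consistency check of (5.47) and the formula for $\lambda_n$, consider again the Legendre case $w = 1$, $s = 1 - x^2$ (an assumption for this sketch). The $n = 1$ Rodrigues formula gives $K_1 Q_1 = ds/dx$, so $K_1\, dQ_1/dx = d^2s/dx^2 = -2$ and the theorem predicts $\lambda_n = n(n+1)$. The code below evaluates the residual of (5.47) for the first few $P_n$.

```python
import numpy as np
from numpy.polynomial import Polynomial, legendre

s = Polynomial([1.0, 0.0, -1.0])       # s(x) = 1 - x^2, with w(x) = 1
d2s = s.deriv(2)(0.0)                  # d^2 s / dx^2 = -2
K1_dQ1 = d2s                           # K_1 Q_1 = ds/dx when w = 1

def eigen_constant(n):
    """lambda_n = -n (K_1 dQ_1/dx + (n - 1)/2 * d^2 s/dx^2)."""
    return -n * (K1_dQ1 + 0.5 * (n - 1) * d2s)

def residual(n):
    """Coefficients of d/dx[s Q_n'] + lambda_n Q_n for Q_n = P_n."""
    unit = np.zeros(n + 1)
    unit[n] = 1.0
    Q = Polynomial(legendre.leg2poly(unit))    # power-series form of P_n
    lhs = (s * Q.deriv()).deriv()
    return (lhs + eigen_constant(n) * Q).coef

max_residual = max(np.max(np.abs(residual(n))) for n in range(8))
```

A vanishing residual confirms that, for this choice of $s$ and $w$, (5.47) is the familiar Legendre differential equation.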


5.2.7 Generating Functions (I) 

As a matter of fact, all the orthogonal polynomials Q n (x ) discussed thus far 
can be generated from a single function g(t,x) of two variables by repeated 
differentiation with respect to t. Called a generating function, it plays a 
significant role in many areas of mathematics. Here we study the essence of 
generating functions together with several examples by which we can derive 
specific orthogonal polynomials. 

A formal definition of generating functions is given below. 

Generating function:

Assume a (finite or infinite) convergent power series

$$ \gamma(t) = \sum_k f_k t^k. $$

The $\gamma(t)$ is called a generating function for the sequence of coefficients
$f_1, f_2, \ldots, f_n, \ldots$.

Clearly, all the coefficients $f_n$ are obtained from differentiating $\gamma(t)$ as given
by

$$ f_n = \frac{1}{n!} \left. \frac{d^n \gamma(t)}{dt^n} \right|_{t=0}. $$

For orthogonal polynomials, generating functions are assumed to take the
form

$$ g(t, x) = \sum_{n=0}^{\infty} A_n Q_n(x) t^n, \tag{5.48} $$

where $Q_n(x)$ is an orthogonal polynomial associated with $g(t, x)$, and the $A_n$
are appropriate constants. The explicit form of $g(t, x)$ can often be derived
using the Rodrigues formula and Cauchy's integral formula (see Sect. 7.3.1).
Remember that the latter formula determines an nth-order derivative of a
function $f(z)$ as

$$ \frac{d^n f(z)}{dz^n} = \frac{n!}{2\pi i} \oint_C \frac{f(\zeta)\, d\zeta}{(\zeta - z)^{n+1}}, $$

where $f(z)$ is analytic within the closed contour $C$. (See Sect. 7.1.2 for a
definition of analytic functions.) Applying this to the Rodrigues formula
for, say, Hermite polynomials $H_n(x)$, we obtain

$$ H_n(x) = (-1)^n e^{x^2/2} \frac{d^n}{dx^n} e^{-x^2/2} = (-1)^n e^{x^2/2} \frac{n!}{2\pi i} \oint_C \frac{e^{-\zeta^2/2}}{(\zeta - x)^{n+1}}\, d\zeta. $$

We then try to sum the series as

$$ \sum_{n=0}^{\infty} \frac{H_n(x) t^n}{n!} = \frac{e^{x^2/2}}{2\pi i} \oint_C \sum_{n=0}^{\infty} \frac{(-1)^n t^n}{(\zeta - x)^{n+1}}\, e^{-\zeta^2/2}\, d\zeta = \frac{e^{x^2/2}}{2\pi i} \oint_C \frac{e^{-\zeta^2/2}}{\zeta - x + t}\, d\zeta, $$

where we require that the point $x - t$ be inside the contour. Finally we evaluate
the above integral and find

$$ e^{tx - (t^2/2)} = \sum_{n=0}^{\infty} \frac{H_n(x) t^n}{n!}. $$

Comparing this last equation with (5.48), we see that $e^{tx - (t^2/2)}$ is the
generating function associated with Hermite polynomials $H_n(x)$. Similarly, we can
derive the generating function for Laguerre polynomials as

$$ \frac{e^{-tx/(1-t)}}{(1 - t)^{\nu+1}} = \sum_{n=0}^{\infty} L_n^{\nu}(x) t^n. $$
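
The Hermite generating function can be verified numerically. In the sketch below, $H_n$ is taken to be the Rodrigues form $(-1)^n e^{x^2/2} (d^n/dx^n) e^{-x^2/2}$ used in the derivation above, which coincides with the polynomials that NumPy calls `hermite_e` (this identification, the truncation order, and the sample values of $t$ and $x$ are assumptions of the example).

```python
import numpy as np
from math import factorial
from numpy.polynomial import hermite_e

def generating_sum(t, x, n_terms=40):
    """Partial sum of sum_n H_n(x) t^n / n! with H_n in the e^{-x^2/2} convention."""
    coeffs = np.array([t**n / factorial(n) for n in range(n_terms)])
    # hermeval evaluates sum_n coeffs[n] * He_n(x).
    return hermite_e.hermeval(x, coeffs)

t, x = 0.3, 1.1                        # hypothetical sample point
closed_form = np.exp(t * x - t**2 / 2)
series_value = generating_sum(t, x)
```

The factor $1/n!$ makes the series converge for every $t$, so a modest number of terms suffices.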


5.2.8 Generating Functions (II) 

There is an alternative way to determine a generating function, which is based 
on the recurrence formula for a particular polynomial. To see this, we try to 
find the generating function of the Legendre polynomials that satisfies the 
following recursion formula: 

$$ (n + 1) P_{n+1}(x) - (2n + 1) x P_n(x) + n P_{n-1}(x) = 0, $$

with $P_0(x) = 1$, $P_1(x) = x$, and for convenience we set $P_{-1}(x) = 0$. We seek
an expression in closed form for

$$ g(t, x) = \sum_{n=0}^{\infty} P_n(x) t^n. $$

First we note that

$$ \frac{\partial g}{\partial t} = \sum_{n=0}^{\infty} n P_n(x) t^{n-1} = \sum_{n=0}^{\infty} (n + 1) P_{n+1}(x) t^n = \sum_{n=0}^{\infty} \left[ (2n + 1) x P_n(x) - n P_{n-1}(x) \right] t^n. $$

By straightforward rearrangement we find that

$$ \frac{\partial g}{\partial t} = x g(t, x) + 2tx \frac{\partial g}{\partial t} - t g(t, x) - t^2 \frac{\partial g}{\partial t}, $$

which leads to the partial differential equation

$$ \frac{1}{g} \frac{\partial g}{\partial t} = \frac{x - t}{1 - 2tx + t^2}. $$

Coupled with the initial condition $g(0, x) = 1$ we finally have

$$ g(t, x) = \frac{1}{\sqrt{1 - 2tx + t^2}} = \sum_{n=0}^{\infty} P_n(x) t^n. $$

Generating functions for other orthogonal polynomials are given in
Appendix D.


Exercises 

1. Find the recurrence formula for normalized polynomials $\tilde{Q}_n(x)$.

Solution: When the polynomials are normalized, we have $I_n = 1$ $(n =
0, 1, 2, \ldots)$ from (5.39). The recurrence formula (5.36) is

$$ \tilde{Q}_{n+1}(x) = (a_n x + b_n) \tilde{Q}_n(x) - \frac{a_n}{a_{n-1}} \tilde{Q}_{n-1}(x). $$

2. Assume that a sequence of orthogonal polynomials satisfies

$$ Q_{n+1}(x) = \left[ (n + 1)x + 1 \right] Q_n(x) - 3(n + 1) Q_{n-1}(x). $$

Find the normalization constants $\lambda_n$ for $Q_n(x)$ defined by $\tilde{Q}_n(x) = \lambda_n Q_n(x)$,
where $\tilde{Q}_n(x)$ are normalized polynomials.

Solution: We denote the normalized polynomials as $\tilde{Q}_n(x) = \lambda_n Q_n(x)$,
where the constants $\lambda_n$ $(n = 0, 1, 2, \ldots)$ are to be found.
Substituting $Q_n(x) = \tilde{Q}_n(x)/\lambda_n$ into the given formula, we have

$$ \tilde{Q}_{n+1}(x) = \left[ (n + 1)x + 1 \right] \frac{\lambda_{n+1}}{\lambda_n} \tilde{Q}_n(x) - 3(n + 1) \frac{\lambda_{n+1}}{\lambda_{n-1}} \tilde{Q}_{n-1}(x). $$

Comparing this with the normalized recurrence formula from
Exercise 1, we have the relation $(3\lambda_n)/\lambda_{n-1} = \lambda_{n-1}/(n\lambda_n)$, which
yields $\lambda_n = \lambda_{n-1}/\sqrt{3n}$. This relation gives the normalization
constants of the form

$$ \lambda_n = \left( 3^n n! \right)^{-1/2} \lambda_0. $$


3. Find the recurrence formula for Hermite and Legendre polynomials.

Solution: For Hermite and Legendre polynomials, (5.36) reads

$$ H_{n+1}(x) = 2x H_n(x) - 2n H_{n-1}(x) \tag{5.49} $$

and

$$ (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x), \tag{5.50} $$

respectively. See Appendix D for the recurrence relations associated
with the other polynomials we have discussed.


4. Determine the constants $\lambda_n$ given in (5.47).

Solution: Setting $m = n$ on the left-hand side of (5.46), we obtain

$$ \int_a^b Q_n(x) \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] dx \tag{5.51} $$

$$ = \int_a^b Q_n(x) \left[ \frac{d(sw)}{dx} \frac{dQ_n}{dx} + s(x) w(x) \frac{d^2 Q_n}{dx^2} \right] dx = \int_a^b w(x) Q_n(x) \left[ K_1 Q_1(x) \frac{dQ_n}{dx} + s(x) \frac{d^2 Q_n}{dx^2} \right] dx. \tag{5.52} $$

Here we used the relation $d(sw)/dx = w K_1 Q_1$ [set $n = 1$ in the
general Rodrigues formula (5.26)]. The orthogonality of $Q_n(x)$
means that only the nth power of $x$ in the square brackets
contributes to the integral in the last line of (5.52). [See (5.30) for
details of the orthogonal property of $Q_n(x)$.] We then set up the
following expressions:

$$ s(x) = ax^2 + bx + c, \qquad Q_n(x) = \xi_n x^n + \xi_{n-1} x^{n-1} + \cdots, \qquad Q_1(x) = \eta_1 x + \eta_0, $$

which result in

$$ K_1 Q_1 \frac{dQ_n}{dx} = K_1 \eta_1 n \xi_n x^n + (\text{const.}) \times x^{n-1} + \cdots $$

and

$$ s(x) \frac{d^2 Q_n}{dx^2} = a n (n - 1) \xi_n x^n + (\text{const.}) \times x^{n-1} + \cdots. $$

Thus the relevant terms in the square brackets in the last line in
(5.52) become

$$ \left[ K_1 n \frac{dQ_1}{dx} + \frac{n(n - 1)}{2} \frac{d^2 s}{dx^2} \right] \xi_n x^n, $$

where we used $\eta_1 = dQ_1/dx$ and $a = (1/2)(d^2 s/dx^2)$, and we get

$$ \int_a^b Q_n(x) \frac{d}{dx} \left[ s(x) w(x) \frac{dQ_n}{dx} \right] dx = \left[ n K_1 \frac{dQ_1}{dx} + \frac{n(n - 1)}{2} \frac{d^2 s}{dx^2} \right] \int_a^b w(x) Q_n(x)\, \xi_n x^n\, dx = n \left( K_1 \frac{dQ_1}{dx} + \frac{n - 1}{2} \frac{d^2 s}{dx^2} \right) I_n. $$

Comparing this with (5.46) gives us

$$ \lambda_n = -n \left( K_1 \frac{dQ_1}{dx} + \frac{n - 1}{2} \frac{d^2 s}{dx^2} \right). $$

5.3 Chebyshev Polynomials 

5.3.1 Minimax Property 

Thus far we have seen that every real function $f(x)$ defined in a certain interval
(finite or infinite) can be approximated in the mean by appropriate orthogonal
polynomials $\{Q_n(x)\}$ as

$$ f(x) \simeq \sum_{i=0}^{n} c_i Q_i(x). \tag{5.53} $$

The coefficients $c_i$ are determined formally by using the orthogonality of the
polynomials in question. The striking advantage of such polynomial
approximations is that an improvement in the approximation through addition of an
extra term $c_{n+1} Q_{n+1}(x)$ does not affect the previously obtained coefficients
$c_0, c_1, \ldots, c_n$.

In principle, any of the polynomials that we discussed in Sect. 5.2 can be used
in the approximation (5.53). From the point of view of numerical analysis, however,
the Chebyshev polynomials $\{T_n(x)\}$ are the best choice, primarily because, over
the domain $[-1, 1]$, an approximation in terms of the $T_n(x)$ has very nearly the smallest
maximum deviation from the true function $f(x)$ to be approximated. This
property, which is unique to Chebyshev polynomials, is known as the
minimax property. In general, polynomials endowed with the minimax property
are very difficult to find, but fortunately, the Chebyshev polynomials fall into
this category and, moreover, are easy to compute.

To show the minimax property of Chebyshev polynomials, we have to be
aware of two of their other properties. The first is a concise formula for $T_n(x)$
that is an alternative to those based on the Rodrigues formula.

Concise formula for Chebyshev polynomials:

$$ T_n(x) = \cos\left( n \cos^{-1} x \right) \qquad (n = 0, 1, \ldots). \tag{5.54} $$

The derivation of (5.54) requires some lengthy calculations, so we put it in
the next subsection (see Sect. 5.3.2). Equation (5.54) implies that each $T_n(x)$
has $n$ zeros in the interval $[-1, 1]$, which are located at the points

$$ x = \cos\left[ \frac{\pi (k - 1/2)}{n} \right] \qquad (k = 1, 2, \ldots, n). \tag{5.55} $$

In this same interval, there are $n + 1$ extrema (maxima and minima), located
at

$$ x = \cos\left( \frac{\pi k}{n} \right) \qquad (k = 0, 1, \ldots, n). $$

Note that $T_n(x) = 1$ at all of the maxima, whereas $T_n(x) = -1$ at all of the
minima. This feature of $T_n$ is exactly what makes the Chebyshev polynomials
so useful in polynomial approximation of functions.

Remark. Equation (5.54) combined with trigonometric identities can yield
explicit expressions for $T_n(x)$:

$$ T_0(x) = 1, \quad T_1(x) = x, \quad T_2(x) = 2x^2 - 1, \quad T_3(x) = 4x^3 - 3x, \quad \ldots, $$

and more generally,

$$ T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x) \qquad (n \ge 1). $$

The last expression is a special case of the general recurrence formula (5.40)
derived in Sect. 5.2.3.
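
The equivalence between the concise formula (5.54) and the three-term recurrence, together with the locations of the zeros and extrema, can be checked with a few lines of code. The degree $n = 7$ and the evaluation grid below are arbitrary choices for this sketch.

```python
import numpy as np

def chebyshev_T(n, x):
    """Evaluate T_n(x) by the recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    x = np.asarray(x, dtype=float)
    t_prev, t_curr = np.ones_like(x), x.copy()
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

n = 7
x = np.linspace(-1, 1, 401)
closed_form = np.cos(n * np.arccos(x))                      # formula (5.54)
zeros = np.cos(np.pi * (np.arange(1, n + 1) - 0.5) / n)     # the n zeros of T_n
extrema = np.cos(np.pi * np.arange(0, n + 1) / n)           # the n + 1 extrema
```

Evaluating `chebyshev_T(n, zeros)` gives values at rounding level, while `chebyshev_T(n, extrema)` alternates between $+1$ and $-1$.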

The second property of Chebyshev polynomials to be noted is the discrete
orthogonality relation described below. (The proof is given in Sect. 5.3.3.)

Discrete orthogonal relation:

If $x_k$ $(k = 1, \ldots, n)$ are the $n$ zeros of $T_n(x)$ given by (5.55) and $i, j < n$,
then

$$ \sum_{k=1}^{n} T_i(x_k) T_j(x_k) = \begin{cases} 0, & i \ne j, \\ n/2, & i = j \ne 0, \\ n, & i = j = 0. \end{cases} \tag{5.56} $$

From (5.54) and (5.56), we obtain the following theorem:

Theorem:

Suppose $f(x)$ to be an arbitrary function in the interval $[-1, 1]$ and
define $c_j$ $(j = 1, \ldots, n)$ by

$$ c_j = \frac{2}{n} \sum_{k=1}^{n} f(x_k) T_{j-1}(x_k), \tag{5.57} $$

where $x_k$ is the kth zero of $T_n(x)$ given by (5.55). We then have

$$ f(x) = \sum_{k=1}^{n} c_k T_{k-1}(x) - \frac{c_1}{2} \quad \text{for all } x = x_k. \tag{5.58} $$

What is remarkable is the fact that for $x = x_k$, the finite sum in (5.58) is
equal to $f(x)$ exactly. For $x \ne x_k$, the sum in (5.58) just approximates $f(x)$;
nevertheless the error can be reduced by increasing the degree $n$ of the sum.
Moreover, for practical use, we can truncate the sum in (5.58) to a much
lower degree, for even if we do so, the approximation (5.58) is sufficiently
accurate over the whole interval $[-1, 1]$, not only at the zeros of $T_n(x)$. This is
in contrast to the case of approximations based on other polynomials, where
the degree of summation $n$ should be taken as large as possible to obtain
high accuracy. In fact, this truncation capability is the reason Chebyshev
polynomial expansion is far better than the other choices.
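
The theorem translates directly into a short numerical procedure. The sketch below computes the coefficients (5.57) for the hypothetical test function $f(x) = e^x$ with $n = 20$, confirms that (5.58) is exact at the zeros $x_k$, and then measures the error of a truncated sum with $m = 8$ terms over the whole interval. The function names and parameter values are choices made for this illustration.

```python
import numpy as np

def chebyshev_coefficients(f, n):
    """c_j (j = 1..n) from (5.57), sampled at the n zeros of T_n."""
    k = np.arange(1, n + 1)
    theta_k = np.pi * (k - 0.5) / n
    x_k = np.cos(theta_k)
    j = np.arange(1, n + 1)
    # T_{j-1}(x_k) = cos((j - 1) * theta_k); rows index j, columns index k.
    T = np.cos(np.outer(j - 1, theta_k))
    return (2.0 / n) * (T @ f(x_k)), x_k

def chebyshev_sum(c, x, m=None):
    """Evaluate sum_{k=1}^{m} c_k T_{k-1}(x) - c_1 / 2 (all terms if m is None)."""
    m = len(c) if m is None else m
    theta = np.arccos(np.clip(x, -1.0, 1.0))
    T = np.cos(np.outer(theta, np.arange(m)))
    return T @ c[:m] - 0.5 * c[0]

c, x_k = chebyshev_coefficients(np.exp, 20)
grid = np.linspace(-1, 1, 501)
error_at_nodes = np.max(np.abs(chebyshev_sum(c, x_k) - np.exp(x_k)))
error_truncated = np.max(np.abs(chebyshev_sum(c, grid, m=8) - np.exp(grid)))
```

For this smooth function the coefficients fall off so quickly that eight terms already give an error far below what most applications require, which is the truncation property described above.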

To examine the above statement, let us suppose that $n$ is so large that
(5.58) is virtually a perfect approximation of $f(x)$. We then consider the
truncated approximation

$$ f(x) \simeq \sum_{k=1}^{m} c_k T_{k-1}(x) - \frac{c_1}{2} \quad \text{with } m < n, \tag{5.59} $$

where the coefficients $c_k$ are given in (5.57). The difference between (5.58)
and (5.59) is given by

$$ \sum_{k=m+1}^{n} c_k T_{k-1}(x), \tag{5.60} $$

which can be no larger than the sum of the magnitudes of the neglected $c_k$'s, as the $T_n(x)$'s are
all bounded between $\pm 1$.

Now we consider the magnitude of the sum (5.60). We know that in general
the $c_k$'s decrease rapidly with $k$, which follows intuitively from the definition
(5.57). Hence, the magnitude of (5.60) is dominated by the term $c_{m+1} T_m(x)$,
which is much less than unity for all $x \in [-1, 1]$. In addition, $c_{m+1} T_m(x)$ is an
oscillatory function with $m + 1$ equal extrema distributed almost uniformly
over the interval $[-1, 1]$. These two features of the dominant term $c_{m+1} T_m(x)$
result in smooth spreading out of the error of the approximation (5.59). This
implies that the Chebyshev approximation (5.59) is very nearly the
same as the minimax polynomial that has the smallest maximum deviation
from the true function $f(x)$.

5.3.2 A Concise Representation 

The aim here is to derive the alternative representation of Chebyshev
polynomials given in (5.54):

$$ T_n(x) = \cos\left[ n \cos^{-1}(x) \right]. $$

We know that Chebyshev polynomials satisfy the relation

$$ (1 - x^2) \frac{d^2}{dx^2} T_n(x) - x \frac{d}{dx} T_n(x) + n^2 T_n(x) = 0, $$

which can be rewritten in the form

$$ \frac{d}{dx} \left[ \sqrt{1 - x^2}\, \frac{d}{dx} T_n(x) \right] + \frac{n^2}{\sqrt{1 - x^2}} T_n(x) = 0. \tag{5.61} $$

We now apply the following lemma:

Lemma:

Let $p(x)$ and $q(x)$ be two positive, continuously differentiable functions
that satisfy the differential equation

$$ \frac{d}{dx} \left[ p(x) \frac{d}{dx} y(x) \right] + q(x) y(x) = 0. \tag{5.62} $$

If the product $p(x)q(x)$ is nonincreasing (or nondecreasing), then the
relative maxima of $[y(x)]^2$ form a nondecreasing (nonincreasing) set.
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(The proof of this lemma is outlined in Exercise 1.) We can see that if 

p(x) = \/l - x 2 and q{x) = , 

Vl — x z 

(5.61) corresponds to (5.62), which implies that the product pq is constant. 
Thus, according to the lemma, all relative maxima of T%(x) must assume the 
same value. 

Now we seek a polynomial $T_n(x)$ of degree $n$ that satisfies the condition 

$$T_n^2(x) = 1 \quad \text{whenever} \quad T_n'(x) = 0.$$

That is, $T_n^2(x) = 1$ at all $x$ where $T_n^2(x)$ has a relative maximum. 
Clearly at these points, both $T_n^2(x) - 1$ and $[T_n'(x)]^2$ have double zeros. Then 
the function 

$$\frac{T_n^2(x) - 1}{[T_n'(x)]^2} \tag{5.63}$$

is a rational function and all the zeros of the denominator also occur in the 
numerator. If we compare the degrees of the polynomials in the denominator 
and in the numerator, it follows that (5.63) is a quadratic, and without loss 
of generality we have 

$$\frac{T_n^2(x) - 1}{[T_n'(x)]^2} = a(x^2 - 1). \tag{5.64}$$

The constant $a$ can be determined by dividing both sides by $x^2$ and letting 
$x$ approach infinity. Then, inserting a polynomial of degree $n$ for $T_n(x)$, we 
obtain 

$$\frac{1}{n^2} = a,$$

which yields 

$$\frac{T_n^2(x) - 1}{[T_n'(x)]^2} = \frac{x^2 - 1}{n^2}. \tag{5.65}$$



Equation (5.65) is a differential equation for $T_n(x)$ that determines the 
explicit form of our desired $T_n(x)$. To solve it, we set 

$$T_n(x) = \cos\theta, \quad x = \cos\phi,$$

where $\theta$ and $\phi$ are functions of $x$. We then have 

$$T_n^2(x) - 1 = -\sin^2\theta$$

and 

$$\frac{d}{dx}T_n(x) = \frac{d}{d\phi}(\cos\theta)\,\frac{d\phi}{dx} = \frac{\sin\theta}{\sin\phi}\,\frac{d\theta}{d\phi}.$$

Substituting these in (5.65), together with $x^2 - 1 = -\sin^2\phi$, yields 

$$\left(\frac{d\theta}{d\phi}\right)^2 = n^2 \quad \text{so that} \quad \theta = \pm n\phi + c,$$

and we get 

$$T_n(x) = \cos\left(n\cos^{-1}x + c\right).$$

To determine $c$, we note that 

$$T_n^2(\pm 1) = 1 = \cos^2(c).$$

Hence, we may take $c = 0$ and we eventually obtain 

$$T_n(x) = \cos\left(n\cos^{-1}x\right). \tag{5.66}$$

5.3.3 Discrete Orthogonality Relation 


We close this section by proving the discrete orthogonality relation (5.56) for 
Chebyshev polynomials. 

Proof (of the discrete orthogonality relation): Let $x_k$ ($k = 1, 2, \cdots, n$) 
be the $n$ zeros of $T_n(x)$, which are given by 

$$x_k = \cos\left[\frac{\pi}{n}\left(k - \frac{1}{2}\right)\right] \quad (k = 1, 2, \cdots, n).$$

Then the value of $T_\ell(x)$ at $x = x_k$, in which $\ell < n$ is assumed, reads 

$$T_\ell(x_k) = \cos\left[\ell\cos^{-1}(x_k)\right] = \cos\left[\frac{\ell\pi}{n}\left(k - \frac{1}{2}\right)\right].$$

Using the trigonometric identity $\cos A\cos B = \frac{1}{2}[\cos(A+B) + \cos(A-B)]$, 
we have for $\ell, m < n$, 

$$T_\ell(x_k)T_m(x_k) = \frac{1}{2}\cos\left[\frac{\pi(\ell + m)}{2n}(2k - 1)\right] + \frac{1}{2}\cos\left[\frac{\pi(\ell - m)}{2n}(2k - 1)\right]. \tag{5.67}$$

If $\ell = m = 0$, this equals 1 so that we obtain 

$$\sum_{k=1}^{n} T_\ell(x_k)^2 = \sum_{k=1}^{n} 1 = n. \tag{5.68}$$

Otherwise, if $\ell = m \neq 0$, the second term in the last line of (5.67) 
equals 1/2 and we have 

$$\sum_{k=1}^{n} T_\ell(x_k)^2 = \frac{n}{2} + \frac{1}{2}\sum_{k=1}^{n}\cos\left[\frac{\ell\pi}{n}(2k - 1)\right] = \frac{n}{2} + \frac{\sin(2\ell\pi)}{4\sin(\ell\pi/n)} = \frac{n}{2}, \tag{5.69}$$

where we used the equation (see Exercise 2) 

$$\sum_{k=1}^{n}\cos(2k - 1)x = \frac{\sin 2nx}{2\sin x} \quad (\text{for } x \neq 0).$$

In a similar manner, for the case $\ell \neq m$ we find that 

$$\sum_{k=1}^{n} T_\ell(x_k)T_m(x_k) = 0. \tag{5.70}$$

Equations (5.68), (5.69), and (5.70) together are identical to the 
desired result given in (5.56). ∎ 
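
The three cases (5.68)-(5.70) are easy to check numerically. The sketch below is only an illustration: it evaluates $T_n(x)$ through the closed form $\cos(n\cos^{-1}x)$ of Sect. 5.3.2 and sums over the zeros $x_k = \cos[\pi(k - 1/2)/n]$; the function names `T` and `discrete_inner` are hypothetical, and both indices are assumed to be smaller than $n$.

```python
import math

def T(n, x):
    """Chebyshev polynomial through the closed form cos(n arccos x), |x| <= 1."""
    return math.cos(n * math.acos(x))

def discrete_inner(l, m, n):
    """sum_{k=1}^{n} T_l(x_k) T_m(x_k) over the n zeros x_k of T_n (l, m < n)."""
    zeros = [math.cos(math.pi * (k - 0.5) / n) for k in range(1, n + 1)]
    return sum(T(l, x) * T(m, x) for x in zeros)

n = 8
print(discrete_inner(0, 0, n))   # case l = m = 0, close to n
print(discrete_inner(3, 3, n))   # case l = m != 0, close to n / 2
print(discrete_inner(2, 5, n))   # case l != m, close to 0
```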


Exercises 


1. Prove the lemma associated with the differential equation (5.62). 


Solution: The proof is based on the nondecreasing property of the 
function defined by 

$$f(x) = [y(x)]^2 + \frac{[p(x)y'(x)]^2}{p(x)q(x)},$$

in which the functions y(x), p(x), and q(x) are assumed to satisfy 
the differential equation (5.62). The nondecreasing property of 
f(x) is verified by examining its derivative: 

$$f'(x) = 2yy' + \frac{2}{pq}(py')(py')' + \left(\frac{1}{pq}\right)'(py')^2 = -\frac{(pq)'}{(pq)^2}(py')^2,$$

where we used the condition (5.62) in the form $(py')' = -qy$. By 
hypothesis, $pq$ is nonincreasing, which implies $(pq)' \le 0$. Hence, it 
is readily seen that $f' \ge 0$, i.e., that $f$ is nondecreasing. 

Now we note that $y'$ must vanish wherever $y(x)^2$ has a relative 
maximum, so that $f(x) = y^2$ there. Suppose that $x_1$ and $x_2$ are 
two successive zeros of $y'$, such that $x_1 < x_2$. Since f(x) is 
nondecreasing, we have $f(x_2) \ge f(x_1)$, or equivalently, $y^2(x_2) \ge y^2(x_1)$, 
which means that the relative maxima of $y^2$ form a nondecreasing 
set. The case of nondecreasing $pq$ follows in the same way with the 
inequalities reversed. This completes the proof of the lemma. ∎ 


2. Prove that $\sum_{k=1}^{N}\cos(2k - 1)x = \dfrac{\sin 2Nx}{2\sin x}$ (for $x \neq 0$). 

Solution: This equation is obtained by considering the sum 

$$\sum_{k=1}^{N} e^{i(2k-1)x} = e^{-ix}\sum_{k=1}^{N} e^{2ikx} = e^{-ix}\left[\frac{1 - e^{2i(N+1)x}}{1 - e^{2ix}} - 1\right] = e^{i(N-1)x}\cdot\frac{\sin(N+1)x}{\sin x} - e^{-ix}.$$

Taking the real part of both sides yields 

$$\sum_{k=1}^{N}\cos(2k - 1)x = \cos(N - 1)x\cdot\frac{\sin(N + 1)x}{\sin x} - \cos x = \frac{\sin 2Nx + \sin 2x}{2\sin x} - \frac{2\sin x\cos x}{2\sin x} = \frac{\sin 2Nx}{2\sin x}. \quad ∎$$


3. Derive the formula for Chebyshev polynomials: 

$$\frac{1 - t^2}{1 - 2tx + t^2} = T_0(x) + 2\sum_{m=1}^{\infty} T_m(x)t^m,$$

where $|t| < 1$. Then, using this equation, prove that 

$$\int_0^{2\pi}\frac{\cos n\theta}{1 - 2t\cos\theta + t^2}\,d\theta = \frac{2\pi t^n}{1 - t^2},$$

where $n \ge 0$. 

Solution: Setting $x = \cos\theta$, so that $T_m(x) = \cos m\theta$, it follows that 

$$1 + \sum_{m=1}^{\infty} 2t^m\cos m\theta = -1 + 2\,\mathrm{Re}\sum_{m=0}^{\infty} e^{im\theta}t^m = -1 + 2\,\mathrm{Re}\,\frac{1}{1 - te^{i\theta}} = \frac{1 - t^2}{1 - 2t\cos\theta + t^2},$$

which is the desired result. The next equation is found in the Fourier 
cosine series, where the coefficients can be obtained from 

$$\frac{1}{\pi}\int_0^{2\pi}\frac{1 - t^2}{1 - 2t\cos\theta + t^2}\cos n\theta\,d\theta = 2t^n. \quad ∎$$


5.4 Applications in Physics and Engineering 

5.4.1 Quantum-Mechanical State in a Harmonic Potential 

We now consider the application of Hermite polynomials $H_n(x)$ to physical 
systems in the theory of quantum mechanics. We know that $H_n(x)$ satisfies 
the following second-order differential equation: 

$$H_n''(x) - xH_n'(x) + nH_n(x) = 0.$$

Let us introduce the related function 

$$U_n(x) = e^{-x^2/4}H_n(x). \tag{5.71}$$

A simple calculation shows that 

$$U_n''(x) + \left(n + \frac{1}{2} - \frac{x^2}{4}\right)U_n(x) = 0. \tag{5.72}$$

This equation is similar in form to the Schrödinger equation for a quantum 
particle whose motion is confined to a harmonic potential well. In fact, the 
Schrödinger equation is given by 

$$\psi''(x) + \left(E - x^2\right)\psi(x) = 0, \tag{5.73}$$

where $\psi(x)$ is the quantum wave function whose squared modulus at the position 
$x = a$, namely, $|\psi(a)|^2$, represents the probability density of the quantum 
particle being observed at $x = a$. The similarity between (5.72) and (5.73) 
implies that the function defined by (5.71), i.e., the product of $H_n(x)$ and 
$e^{-x^2/4}$, behaves (after a rescaling of $x$) as a wave function that describes 
the quantum particle in the potential well. 

However, it should be noted that solutions of (5.73) do not always satisfy 
the condition 

$$\int_{-\infty}^{\infty}|\psi(x)|^2dx < \infty, \tag{5.74}$$

which must be satisfied for the solutions to be physically meaningful. By 
comparing (5.73) with (5.72), we see that whenever 

$$E = E_n = 2n + 1, \tag{5.75}$$

we have 

$$\psi_n(x) = c_n e^{-x^2/2}H_n\left(\sqrt{2}\,x\right),$$

which clearly satisfies the condition (5.74) if the constants $c_n$ are chosen 
appropriately. Furthermore, the uniqueness theorem for solutions of ordinary 
differential equations (see Sect. 15.2.4) guarantees that the values of E given 
in (5.75) are the only ones for which (5.73) has solutions satisfying (5.74). 
These specific values of E are called the eigenenergies of the system, and 
the corresponding solutions $\psi_n(x)$ are called eigenfunctions. 
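
As a numerical sanity check, one can verify that $\psi_n(x) = e^{-x^2/2}H_n(\sqrt{2}x)$ with $E_n = 2n + 1$ makes the left-hand side of (5.73) vanish. The sketch below rests on two assumptions that are not stated in the text: $H_n$ is generated by the standard three-term recurrence $H_{k+1}(x) = xH_k(x) - kH_{k-1}(x)$ (the normalization consistent with the differential equation quoted above), and $\psi''$ is approximated by a central finite difference, so the residual is small rather than exactly zero. The constant $c_n$ is set to 1.

```python
import math

def hermite_he(n, x):
    """H_n(x) from H_0 = 1, H_1 = x, H_{k+1} = x H_k - k H_{k-1} (assumed recurrence)."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def psi(n, x):
    """Unnormalized eigenfunction exp(-x^2 / 2) H_n(sqrt(2) x)."""
    return math.exp(-x * x / 2) * hermite_he(n, math.sqrt(2) * x)

def residual(n, x, h=1e-3):
    """psi'' + (E_n - x^2) psi with E_n = 2n + 1; psi'' by central differences."""
    d2 = (psi(n, x + h) - 2 * psi(n, x) + psi(n, x - h)) / (h * h)
    return d2 + (2 * n + 1 - x * x) * psi(n, x)

for n in range(4):
    print(n, 2 * n + 1, residual(n, 0.7))
```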


5.4.2 Electrostatic potential generated by a multipole 

Next, we briefly discuss the use of Legendre polynomials in describing 
the electrostatic potential field generated by a multipole. For simplicity, we 
first consider an electric dipole, i.e., a pair of positive and negative charges 
separated by an infinitesimal distance h. We choose our coordinate system 
such that both charges are located on the x-axis with the negative charge at 
the origin. The magnitude of the charges is taken to be $\pm(1/h)$. Then, the 
electrostatic potential field $\Phi_2(P)$ with respect to a point P on the sphere 
$x^2 + y^2 + z^2 = r^2$ is represented as 
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Therefore, when r = 1, we have 

$$\Phi_2(P)\big|_{r=1} = -x = -P_1(x)\cdot 1!,$$

where $P_n(x)$ is a Legendre polynomial. 

Similar descriptions can be presented for high-degree multipoles. The 
potential $\Phi_4(P)$ of a quadrupole is determined as follows: Consider a double 
negative charge $-(2/h^2)$ located at the origin and two positive charges $1/h^2$ 
located at the points $(x, y, z) = (\pm h, 0, 0)$. Then, the associated potential 
$\Phi_4(P)$ at a point on a sphere of radius $r$ is given by 

$$\Phi_4(P) = \lim_{h\to 0}\frac{1}{h^2}\left(\frac{1}{\sqrt{(x + h)^2 + y^2 + z^2}} - \frac{2}{\sqrt{x^2 + y^2 + z^2}} + \frac{1}{\sqrt{(x - h)^2 + y^2 + z^2}}\right) = -\frac{r^2 - 3x^2}{r^5},$$

so for $r = 1$, 

$$\Phi_4(P)\big|_{r=1} = -1 + 3x^2 = P_2(x)\cdot 2!.$$

Similarly, for an octapole, we get 

$$\Phi_8(P)\big|_{r=1} = \left.\frac{\partial^3}{\partial x^3}\left(\frac{1}{r}\right)\right|_{r=1} = -15x^3 + 9x = -P_3(x)\cdot 3!,$$

and in general 

$$\Phi_{2^n}(P)\big|_{r=1} = \left.\frac{\partial^n}{\partial x^n}\left(\frac{1}{r}\right)\right|_{r=1} = (-1)^n n!\,P_n(x).$$

The final result tells us that the potential of a $2^n$-pole is described by the 
product of the Legendre polynomial $P_n(x)$ and the factor $(-1)^n n!$. By solving 
the previous equation for $P_n(x)$, we obtain the following expression for the 
$n$th Legendre polynomial: 

$$P_n(x) = \frac{(-1)^n}{n!}\left.\frac{\partial^n}{\partial x^n}\left(\frac{1}{r}\right)\right|_{r=1}.$$
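
The dipole and quadrupole limits can be illustrated by replacing $h \to 0$ with a small finite $h$. This is a rough sketch under stated assumptions: for the dipole, the positive charge $1/h$ is placed at $(-h, 0, 0)$, a sign convention chosen here so that the result reproduces $-P_1(x)\cdot 1!$; the step sizes and the sample point on the unit sphere are arbitrary, and the helper names are hypothetical.

```python
import math

def inv_dist(x, y, z, x0):
    """1 / distance from the point (x, y, z) to a charge at (x0, 0, 0)."""
    return 1.0 / math.sqrt((x - x0) ** 2 + y * y + z * z)

def dipole(x, y, z, h=1e-5):
    # Assumed layout: charge +1/h at (-h, 0, 0) and charge -1/h at the origin.
    return (inv_dist(x, y, z, -h) - inv_dist(x, y, z, 0.0)) / h

def quadrupole(x, y, z, h=1e-3):
    # Charges 1/h^2 at (+h, 0, 0) and (-h, 0, 0), and -2/h^2 at the origin.
    return (inv_dist(x, y, z, -h) - 2 * inv_dist(x, y, z, 0.0)
            + inv_dist(x, y, z, h)) / (h * h)

def legendre_p1(x):
    return x

def legendre_p2(x):
    return (3 * x * x - 1) / 2

# A point on the unit sphere r = 1.
px, py, pz = 0.6, 0.8, 0.0
print(dipole(px, py, pz), -legendre_p1(px))          # Phi_2 versus -P_1(x) * 1!
print(quadrupole(px, py, pz), 2 * legendre_p2(px))   # Phi_4 versus  P_2(x) * 2!
```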
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Lebesgue Integrals 


Abstract The concept of “measure” (Sect. 6.1.2) is important for an understanding 
of the theory of the Lebesgue integral. A measure is a generalization of the concept of 
length that allows us to quantify the length of a set that is composed of, for instance, 
an infinite number of infinitesimal points with a highly discontinuous distribution. 
Thus, the Lebesgue integral is an effective tool for integrating highly discontinuous 
functions that cannot be integrated using conventional Riemann integrals. 


6.1 Measure and Summability 

6.1.1 Riemann Integral Revisited 

It is certain that the Riemann integral is adequate for practical applications 
to most problems in physics and engineering, as the functions that we usually 
encounter are continuous (piecewise, at least) so that they are integrable by 
the Riemann procedure. In advanced subjects in mathematical physics, how- 
ever, we come to a class of highly irregular functions where the concept of an 
ordinary Riemann integral is not applicable. In order to treat such functions, 
we have to employ another, more flexible integral than the Riemann integral. 
In this chapter, we present a concise description of the Lebesgue integral. 
The Lebesgue integral not only overcomes many of the difficulties inherent in 
the use of the Riemann integral, but its study has also generated new concepts 
and techniques that are extremely valuable in practical problems in modern 
physics and engineering. 

At first, the cultivation of an intuitive feeling for the Lebesgue integral 
as an adjunct to formal manipulations and calculations is important, and we 
achieve this by comparing it with the Riemann integral. When defining the 
Riemann integral of a function f(x) on an interval I = [a, b], we divide the 
entire interval [a, b] into small subintervals $\Delta x_k = [x_k, x_{k+1}]$ such that 

$$a = x_1 < x_2 < \cdots < x_{n+1} = b.$$

The finite set $\{x_i\}$ of numbers is called a partition P of the interval I. Using 
this notation P, let us define, e.g., the sums 

$$S_P(f) = \sum_{k=1}^{n} M_k(x_{k+1} - x_k), \quad s_P(f) = \sum_{k=1}^{n} m_k(x_{k+1} - x_k),$$

where $M_k$ and $m_k$ are the supremum and infimum of f(x) on the interval 
$\Delta x_k = [x_k, x_{k+1}]$, respectively, given by 

$$M_k = \sup_{x\in\Delta x_k} f(x), \quad m_k = \inf_{x\in\Delta x_k} f(x). \tag{6.1}$$

Evidently, the relation $S_P(f) \ge s_P(f)$ holds if the function f(x) is bounded 
on the interval I = [a, b]. We take the limit inferior (or limit superior) of the 
sums, 

$$S(f) = \liminf_{n\to\infty} S_P, \quad s(f) = \limsup_{n\to\infty} s_P, \tag{6.2}$$

where all possible choices of the partition P are taken into account. S(f) 
and s(f) are called the upper and lower Riemann-Darboux integrals of 
f over I, respectively. If the two coincide, i.e., if 

$$S(f) = s(f) = A,$$

the common value A is called the Riemann integral and the function f(x) 
is called Riemann integrable, with 

$$A = \int_a^b f(x)dx.$$
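
To make the upper and lower sums concrete, the sketch below evaluates $S_P$ and $s_P$ exactly (with rational arithmetic) for a nondecreasing function on a uniform partition, where the supremum and infimum on each subinterval are simply the right and left endpoint values. The test function $f(x) = x^2$ on [0, 1] and the function names are illustrative choices.

```python
from fractions import Fraction

def darboux_sums(f, a, b, n):
    """Upper and lower sums (S_P, s_P) for a nondecreasing f on a uniform
    partition of [a, b] into n subintervals."""
    a, b = Fraction(a), Fraction(b)
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    upper = sum(f(xs[k + 1]) * (xs[k + 1] - xs[k]) for k in range(n))
    lower = sum(f(xs[k]) * (xs[k + 1] - xs[k]) for k in range(n))
    return upper, lower

def square(x):
    return x * x

for n in (10, 100, 1000):
    upper, lower = darboux_sums(square, 0, 1, n)
    print(n, float(lower), float(upper))
```

For this example the two sums differ by exactly $1/n$ and squeeze the value 1/3 between them as the partition is refined.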

We note without proof that the following conditions ensure the existence of 
the Riemann integral of a function f(x). 

1. f(x) is continuous in I = [a, b]. 

2. f(x) is bounded and has only a finite number of discontinuities in I = [a, b]. 

On the other hand, when the function f(x) exhibits too many points of 
discontinuity, the above definition is of no use in forming the integral. An 
illustrative example is given below. 

Examples Assume an enumeration $\{z_n\}$ ($n = 1, 2, \cdots$) of the rational numbers 
between 0 and 1 and let 

$$f(x) = \begin{cases} 1 & (x = z_1, z_2, \cdots, z_n, \cdots), \\ 0 & \text{otherwise}. \end{cases}$$

That is, the function f(x) has the value unity if x is rational and the value 
zero if x is irrational. In any subdivision of the interval, $\Delta x_k \subset [0, 1]$, 

$$m_k = 0, \quad M_k = 1,$$

and 

$$s_P = 0, \quad S_P = 1.$$

Therefore, the upper and lower Darboux integrals are 1 and 0, respectively, 
whence f(x) has no Riemann integral. 

6.1.2 Measure 

The shortcoming of the Riemann procedure demonstrated above can be 
successfully overcome by employing Lebesgue's procedure. The latter requires a 
systematic way of assigning a measure $\mu(X_i)$ to each subset of points $X_i$. 
In the remainder of this section, we learn about the basic properties of 
measure and related material, which serve as preliminaries to the 
precise definition of Lebesgue integrals given in Sect. 6.2. 

The measure for a subset of points is a generalization of the concepts of 
length, area, and volume. Intuitively, the length of an interval 
[a, b] is b − a. Similarly, if we have two disjoint intervals $[a_1, b_1]$ and $[a_2, b_2]$, 
it is natural to interpret the length of the set consisting of these two intervals 
as the sum $(b_1 - a_1) + (b_2 - a_2)$. However, the 'length' of a set of points 
of rational (or irrational) numbers on the line is not obvious. This calls 
for a rigorous mathematical definition of a measure of a point set, as 
shown below. 


Measure of a set of points: 

A measure $\mu(X)$ defined on a set of points X is a function with the 
following two properties: 

1. If the set X is empty or consists of a single point, $\mu(X) = 0$; otherwise, 
$\mu(X) > 0$. 

2. The measure of the sum of two nonoverlapping sets is equal to the 
sum of the measures of these sets, expressed by 

$$\mu(X_1 + X_2) = \mu(X_1) + \mu(X_2) \quad \text{for} \quad X_1 \cap X_2 = \emptyset. \tag{6.3}$$

In the above statement, $X_1 + X_2$ denotes the set containing the elements of both 
$X_1$ and $X_2$, wherein each element is counted only once. If $X_1$ and $X_2$ overlap, 
(6.3) is replaced by 

$$\mu(X_1 + X_2) = \mu(X_1) + \mu(X_2) - \mu(X_1 \cap X_2)$$

so that the points common to $X_1$ and $X_2$ will be counted only once. 

Various kinds of measures have been introduced in mathematics thus far. 
Among them is the following important example of a measure, which plays a 
central role in the subsequent discussions. Consider a monotonically increasing 
function $\alpha(x)$ and let I be an interval (open or closed) with endpoints a and b. 
We define the α-measure of I, denoted by $\mu_\alpha(I)$, which takes different values 
depending on the types of endpoints a and b as shown below. 

α-measure of intervals: 

The α-measures of intervals are defined by 

• $\mu_\alpha([a, b]) = \alpha(b^+) - \alpha(a^-)$ for the closed interval [a, b], 

• $\mu_\alpha((a, b]) = \alpha(b^+) - \alpha(a^+)$ for the semiclosed interval (a, b], 

• $\mu_\alpha([a, b)) = \alpha(b^-) - \alpha(a^-)$ for the semiclosed interval [a, b), 

• $\mu_\alpha((a, b)) = \alpha(b^-) - \alpha(a^+)$ for the open interval (a, b), 

where $\alpha(a^-) = \lim_{\varepsilon\to 0}\alpha(a - \varepsilon)$ and $\alpha(a^+) = \lim_{\varepsilon\to 0}\alpha(a + \varepsilon)$ with $\varepsilon > 0$. 

By definition, the open interval (a, a) is an empty set, so that $\mu_\alpha((a, a)) = 0$ 
for any $a \in R$. The other cases of intervals (a, a] and [a, a) are also empty 
sets. Note that $\mu_\alpha(I) \ge 0$ since $\alpha(x)$ is a monotonically increasing function. 

Examples Let $\alpha(x)$ be the monotonically increasing function (see Fig. 6.1) 

$$\alpha(x) = \begin{cases} 0, & x < 1, \\ 1/2, & x = 1, \\ x, & x > 1. \end{cases} \tag{6.4}$$

We then have 

$$\mu_\alpha([0, 1)) = \alpha(1^-) - \alpha(0^-) = 0 - 0 = 0$$

and 

$$\mu_\alpha([0, 1]) = \alpha(1^+) - \alpha(0^-) = 1 - 0 = 1.$$

Similarly, 

$$\mu_\alpha([1, 2]) = \mu_\alpha([1, 2)) = 2 - 0 = 2,$$

$$\mu_\alpha((1, 2]) = \mu_\alpha((1, 2)) = 2 - 1 = 1.$$

Fig. 6.1. The function $\alpha(x)$ defined in (6.4) 


6.1.3 The Probability Measure 

The significance of measure can be understood by taking probability 
theory as an example. Probability theory deals with statistical properties of a 
random variable x associated with an event occurring sequentially or 
simultaneously, where it is assumed that the average of x approaches a constant 
value as the number of observations increases. 

Given a random variable x, its expected (or mean) value is defined by 
the integral 

$$E\{x\} = \int_{-\infty}^{\infty} x\,p(x)dx, \tag{6.5}$$

where $p(x) \ge 0$ is the probability density function of the random variable 
x defined by 

$$p(x) = \frac{dP(x)}{dx},$$

with the probability distribution function P(x). The function P(x) 
describes the probability that the event labeled x occurs. It follows intuitively 
that 

$$P\{x_1 < x < x_2\} = \int_{x_1}^{x_2} p(x)dx \tag{6.6}$$

and 

$$\int_{-\infty}^{\infty} p(x)dx = 1.$$



Examples For a discrete random variable $\{x_i\}$, the integral of (6.5) can be 
written as a sum: 

$$E\{x\} = \sum_i x_i p_i.$$

In an experiment with dice, e.g., the probability of each event is given by 

$$p_1 = p_2 = \cdots = p_6 = \frac{1}{6},$$

which yields 

$$E\{x_i\} = \sum_{i=1}^{6} x_i p_i = \frac{7}{2}.$$


In probability theory, the probability distribution function P(x) plays the 
role of measure. Assume a set of continuous real numbers, $X = \{x \le a\}$, and 
let the function $\alpha(a)$ be the probability that x has a value no greater than a. 
The function $\alpha(a)$ then reads 

$$\alpha(a) = P(x \le a), \tag{6.7}$$

where $\alpha(-\infty^+) = 0$ and $\alpha(\infty^-) = 1$. Note that $\alpha(a)$ is a monotonically 
increasing function. We have as well 

$$P\{x_1 < x \le x_2\} = \alpha(x_2) - \alpha(x_1),$$

since 

$$P\{x \le x_2\} = P\{x \le x_1\} + P\{x_1 < x \le x_2\}.$$

Therefore, we see that the probability distribution function $P(x \in I)$ 
corresponds to the α-measure for any interval I, as expressed by 

$$\mu_\alpha(I) = P(x \in I),$$

which behaves as $0 \le \mu_\alpha(I) \le 1$ for any I. 


Remark. The mean value (6.5) of a random variable x can be interpreted as a 
Riemann-Stieltjes integral, rather than as an ordinary Riemann integral. 
To see this, we observe that the Riemann integral (6.5) can be expressed by 
the Riemann sum as 

$$\int_{-\infty}^{\infty} x\,p(x)dx \simeq \sum_{k=-\infty}^{\infty}\xi_k\,p(\xi_k)(x_{k+1} - x_k), \tag{6.8}$$

where $\xi_k$ is any point on $\Delta x_k$. Since $p(x_k)(x_{k+1} - x_k) \simeq \Delta P\{x_k < x < x_{k+1}\}$ 
from (6.6), the mean value is written in the form 

$$\int_{-\infty}^{\infty} x\,dP = \int_{-\infty}^{\infty} x\,d\mu(x), \tag{6.9}$$

which is called the Riemann-Stieltjes integral of x with respect to $\mu(x)$. 
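
The two ways of computing a mean value can be compared in a few lines. The sketch below first reproduces the dice example with exact fractions and then forms a Riemann-Stieltjes sum $\sum_k \xi_k[P(x_{k+1}) - P(x_k)]$ with midpoints $\xi_k$. The exponential distribution function $P(x) = 1 - e^{-x}$ (mean 1), the cutoff at $x = 50$, and all function names are illustrative assumptions, not taken from the text.

```python
import math
from fractions import Fraction

def dice_mean():
    """E{x} = sum_i x_i p_i for a fair die with p_i = 1/6."""
    return sum(Fraction(i, 6) for i in range(1, 7))

def stieltjes_mean(P, a, b, n):
    """Riemann-Stieltjes sum  sum_k xi_k [P(x_{k+1}) - P(x_k)]  on a uniform
    partition of [a, b], with xi_k the midpoint of each subinterval."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0, x1 = a + k * h, a + (k + 1) * h
        total += 0.5 * (x0 + x1) * (P(x1) - P(x0))
    return total

def exp_cdf(x):
    """Distribution function of an exponential random variable (assumed example)."""
    return 1.0 - math.exp(-x)

print(dice_mean())
print(stieltjes_mean(exp_cdf, 0.0, 50.0, 20000))
```

Note that the Stieltjes sum only needs the distribution function P, not the density p, which is the point of the remark above.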


6.1.4 Support and Area of a Step Function 

What follows is an important concept that we use together with the concept 
of measure to introduce the definition of the Lebesgue integral. Let I be any 
interval, and consider the step function $\theta(x)$ given by 

$$\theta(x) = \begin{cases} c_i, & x \in I_i,\ i = 1, 2, \cdots, n, \\ 0, & \text{otherwise}, \end{cases}$$

where the $I_i \subset I$ are disjoint intervals and the set $\{c_1, c_2, \cdots, c_n\}$ consists 
of finite real numbers. We see that $\theta$ is constant on each interval $I_i$ and zero 
elsewhere. We now introduce the following concept: 

Support of a step function: 

The disjoint set $S = I_1 \cup I_2 \cup \cdots \cup I_n \subset I$ on which $\theta$ is nonzero is called 
the support of $\theta(x)$. 

An example of the support of $\theta(x)$ is depicted in Fig. 6.2. When the support 
of a step function $\theta$ has a finite total length, we associate it with the area 
$A(\theta)$ between the graph of $\theta$ and the x-axis, with the usual rule that areas 
below the x-axis have a negative sign. We refer to $A(\theta)$ as the area under 
the graph of $\theta$. 

Fig. 6.2. The disjoint set $S = I_1 \cup I_2 \cup \cdots$ that serves as the support of $\theta(x)$ 


Concepts such as support and area can apply to a linear combination of 
step functions. Suppose that $\theta_1, \theta_2, \cdots, \theta_n$ are step functions on the same 
interval I, all with supports of finite total length, and that $a_1, a_2, \cdots, a_n$ are 
finite real numbers. Then, the function $\Theta(x)$ defined by 

$$\Theta(x) = \sum_{j=1}^{n} a_j\theta_j(x) \quad \text{for } x \in I$$

is also a step function on I. The support of $\Theta(x)$ has a finite length and the 
area under the graph of $\Theta(x)$ is given by 

$$A(\Theta) = \sum_{j=1}^{n} a_j A(\theta_j).$$


Examples Let $\theta_1, \theta_2 : [0, 3) \to R$ be defined by 

$$\theta_1(x) = \begin{cases} 1 & \text{for } [0, 2), \\ 2 & \text{for } [2, 3), \end{cases} \qquad \theta_2(x) = \begin{cases} -1 & \text{for } [0, 1], \\ 1 & \text{for } (1, 3). \end{cases} \tag{6.10}$$

Let $\Theta = 2\theta_1 - \theta_2$. Then 

$$\Theta(x) = \begin{cases} 3 & \text{for } [0, 1], \\ 1 & \text{for } (1, 2), \\ 3 & \text{for } [2, 3). \end{cases} \tag{6.11}$$

These are plotted in Fig. 6.3. Clearly $\Theta$ is a step function. Note also that the 
areas are 

$$A(\theta_1) = 2(1) + 1(2) = 4, \quad A(\theta_2) = -1(1) + 2(1) = 1,$$

and 

$$A(\Theta) = 1(3) + 1(1) + 1(3) = 7 = 2A(\theta_1) - A(\theta_2).$$

Fig. 6.3. The functions $\theta_1(x)$, $\theta_2(x)$, $\Theta(x)$ given in (6.10) and (6.11), respectively 
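
The arithmetic of this example can be reproduced with a small sketch. Each step function is stored, as an illustrative choice, as a list of `(left, right, value)` triples on the common partition 0, 1, 2, 3; whether an endpoint is included is ignored here because a single point does not change an ordinary length. The variable names are hypothetical.

```python
def area(step):
    """Area under a step function given as (left, right, value) triples."""
    return sum(c * (b - a) for a, b, c in step)

# theta_1 and theta_2 from the example, written on the common partition 0, 1, 2, 3.
theta1 = [(0, 1, 1), (1, 2, 1), (2, 3, 2)]
theta2 = [(0, 1, -1), (1, 2, 1), (2, 3, 1)]

# Theta = 2 * theta_1 - theta_2, formed piece by piece.
big_theta = [(a, b, 2 * c1 - c2)
             for (a, b, c1), (_, _, c2) in zip(theta1, theta2)]

print(area(theta1), area(theta2), area(big_theta))
```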


6.1.5 α-Summability 

Now, we combine the concepts of α-measure and support of a step function. 
Let $\alpha(x)$ be a monotonically increasing function, I be any interval, and $\theta(x)$ 
be a step function. We further assume that the support of $\theta$ is a simple set, 
i.e., the union of a finite collection of disjoint intervals. For example, the set 
$S = \bigcup_{k=1}^{n} I_k$ is a simple set if $I_1, I_2, \cdots, I_n$ are disjoint intervals. Then, the 
α-measure of S is given by 

$$\mu_\alpha(S) = \sum_{k=1}^{n}\mu_\alpha(I_k).$$

Observe that the value of $\mu_\alpha(S)$ is independent of the way in which the set S 
is subdivided. Note also that 

(i) $\mu_\alpha(S) \ge 0$ for any simple set S, and 

(ii) if S and T are simple sets such that $S \subset T$, then $\mu_\alpha(S) \le \mu_\alpha(T)$. 

We are now ready to present the following statement: 


α-summability: 

A step function $\theta(x)$ is α-summable if the support of $\theta$ has a finite 
α-measure with respect to a given monotonically increasing function $\alpha(x)$. 

Given an α-summable step function $\theta(x)$, we associate it with a real number 
$A_\alpha(\theta)$ defined by 

$$A_\alpha(\theta) = \sum_{k=1}^{n} c_k\,\mu_\alpha(I_k), \tag{6.12}$$

where $c_k$ is the amplitude of the step function $\theta(x)$ for $x \in I_k$. In general, $A_\alpha(\theta)$ 
can be thought of as a generalized area. For example, when setting $\alpha(x) = x$, 
the measure $\mu_\alpha(I_k)$ turns out to be just the ordinary length of the interval 
$I_k$, and $A_\alpha(\theta)$ is then just the area $A(\theta)$ under the graph of $\theta$ as defined in 
Sect. 6.1.4. However, if $\alpha(x)$ has a more complicated functional form, we get a 
different value of $A_\alpha(\theta)$, since in that case a length along the 
x-axis should be measured by the α-measure rather than by ordinary length. 
An example of an actual calculation of $A_\alpha(\theta)$ is provided in Exercise 2. 

Remark. We shall see in Sect. 6.2.2 that the Lebesgue integral is defined by 
the limit $n \to \infty$ of the sum in (6.12). 


6.1.6 Properties of α-summable functions 

We list some basic properties of α-summable step functions without proof. 

• If $\theta(x)$ is a nonnegative α-summable step function with respect to a given 
$\alpha(x)$, then $A_\alpha(\theta) \ge 0$ and $A_\alpha(0) = 0$. 

• If $\theta_1$ and $\theta_2$ are α-summable step functions on the same interval I such 
that $\theta_1 \le \theta_2$ on I, then $A_\alpha(\theta_1) \le A_\alpha(\theta_2)$. 

• Let $\{\theta_j\}$ ($j = 1, 2, \cdots, m$) be α-summable step functions on the same 
interval I, and let $\{a_j\}$ be finite real numbers. By defining $\Theta : I \to R$ as 

$$\Theta(x) = \sum_{j=1}^{m} a_j\theta_j(x)$$

for all $x \in I$ ($\Theta$ is also an α-summable step function on I), we have 

$$A_\alpha(\Theta) = \sum_{j=1}^{m} a_j A_\alpha(\theta_j).$$



Exercises 

1. Assume a monotonically increasing function $\alpha(x)$ defined by 

$$\alpha(x) = \begin{cases} 0, & x \in (-\infty, 1), \\ x^2 - 2x + 2, & x \in [1, 2), \\ 3, & x = 2, \\ x + 2, & x \in (2, \infty). \end{cases}$$

Calculate $A_\alpha(\theta)$ for each of the two step functions: 

$$\theta_1(x) = \begin{cases} -1, & x \in [0, 1), \\ 2, & x \in [1, 3], \end{cases}$$

and 

$$\theta_2(x) = \begin{cases} -1, & x \in [0, 1], \\ 2, & x \in (1, 3]. \end{cases}$$

Solution: Since 

$$\mu_\alpha([0, 1)) = \alpha(1^-) - \alpha(0^-) = 0 - 0 = 0,$$

$$\mu_\alpha([1, 3]) = \alpha(3^+) - \alpha(1^-) = 5 - 0 = 5,$$

we have 

$$A_\alpha(\theta_1) = (-1)0 + 2(5) = 10.$$

For $\theta_2$, on the other hand, we have a different result since 

$$\mu_\alpha([0, 1]) = \alpha(1^+) - \alpha(0^-) = 1 - 0 = 1,$$

$$\mu_\alpha((1, 3]) = \alpha(3^+) - \alpha(1^+) = 5 - 1 = 4,$$

which yields 

$$A_\alpha(\theta_2) = (-1)1 + 2(4) = 7.$$

It is noteworthy that the values of $A_\alpha(\theta_1)$ and $A_\alpha(\theta_2)$ are different, 
although the area $A(\theta)$ for them is the same. The difference comes 
from the fact that $\alpha$ has a discontinuity at the single point where 
$\theta_1$ and $\theta_2$ have different values. ∎ 
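
The calculation in Exercise 1 can be mirrored in code. This is only a sketch: the one-sided limits $\alpha(a^\pm)$ are approximated by evaluating $\alpha$ at $a \pm \varepsilon$ for a small fixed $\varepsilon$, which is adequate for a piecewise-smooth $\alpha$ like this one but is not a general definition of a limit. Intervals are encoded, as an illustrative convention, by tuples `(a, b, closed_left, closed_right, value)`, and the function names are hypothetical.

```python
def alpha(x):
    """The monotonically increasing function of Exercise 1."""
    if x < 1:
        return 0.0
    if x < 2:
        return x * x - 2 * x + 2
    if x == 2:
        return 3.0
    return x + 2

def mu_alpha(alpha_fn, a, b, closed_left, closed_right, eps=1e-9):
    """alpha-measure of an interval; one-sided limits approximated by +/- eps."""
    lower = alpha_fn(a - eps) if closed_left else alpha_fn(a + eps)
    upper = alpha_fn(b + eps) if closed_right else alpha_fn(b - eps)
    return upper - lower

def a_alpha(step, alpha_fn):
    """Generalized area (6.12): sum of c_k * mu_alpha(I_k)."""
    return sum(c * mu_alpha(alpha_fn, a, b, cl, cr) for a, b, cl, cr, c in step)

theta1 = [(0, 1, True, False, -1), (1, 3, True, True, 2)]   # -1 on [0,1), 2 on [1,3]
theta2 = [(0, 1, True, True, -1), (1, 3, False, True, 2)]   # -1 on [0,1], 2 on (1,3]

print(a_alpha(theta1, alpha), a_alpha(theta2, alpha))
```

The two results differ only because the point $x = 1$, where $\alpha$ jumps, is assigned to different pieces of the two step functions.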


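2. Evaluate $A_\alpha(\theta)$ of the step function: 

$$\theta(x) = \begin{cases} 2, & x \in (-\infty, 0], \\ 1, & x \in (0, \infty), \end{cases}$$

which is associated with the α-measure: 

$$\alpha(x) = \begin{cases} 0, & x < 0, \\ 1/2, & x = 0, \\ 1, & x > 0. \end{cases}$$

Solution: Note that $\alpha(0^+) = \lim_{\varepsilon\to 0}\alpha(0 + \varepsilon) = 1$, which differs from 
the value $\alpha(0) = 1/2$. Since 

$$\mu_\alpha((-\infty, 0]) = \alpha(0^+) - \alpha(-\infty^+) = 1 - 0 = 1,$$

$$\mu_\alpha((0, \infty)) = \alpha(\infty^-) - \alpha(0^+) = 1 - 1 = 0,$$

we have 

$$A_\alpha(\theta) = 2(1) + 1(0) = 2. \quad ∎$$
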


3. Show that the function 

$$f(x) = \lim_{n\to\infty}\lim_{m\to\infty}(\cos 2\pi m!x)^n,$$

called Dirichlet's function, takes the form 

$$f(x) = \begin{cases} 1 & \text{for all rational numbers } x, \\ 0 & \text{otherwise}. \end{cases}$$

Solution: When x is a rational number, it is expressed by a 
fraction p/q with relatively prime integers p and q. Hence, for 
sufficiently large m, the product m!x becomes an integer since 

$$m!x = m\cdot(m - 1)\cdots(q + 1)\cdot p\cdot(q - 1)\cdots 2\cdot 1.$$

Thus we have $\cos 2\pi m!x = 1$. Otherwise, if x is an irrational number, 
m!x is also irrational for any m, so that $|\cos 2\pi m!x| < 1$. 
As a result, we obtain 

$$\lim_{n\to\infty}\lim_{m\to\infty}(\cos 2\pi m!x)^n = \begin{cases} 1 : & x \text{ is rational}, \\ 0 : & x \text{ is irrational}. \end{cases}$$


6.2 Lebesgue Integral 

6.2.1 Lebesgue Measure 

The Lebesgue integral procedure essentially reduces to finding a measure for 
sets of arguments. In particular, if a set consists of too many points of 
discontinuity, we need a way to define its measure; this is known as the Lebesgue 
measure. In this subsection, we explain how to construct the Lebesgue 
measure of a point set. 

As a simple example, let us consider a finite interval [a, b] of length L. 
This can be decomposed into two sets: a set X consisting of some of the 
points $x \in [a, b]$ and its complementary set X' consisting of all points 
$x \in [a, b]$ that do not belong to X. A schematic view of X and X' is shown 
in Fig. 6.4. Both X and X' may be sets of several continuous line segments 
or sets of isolated points. 

We would like to evaluate the measure of X. To do this, we cover the set 
of points X by nonoverlapping intervals $\Lambda_i \subset [a, b]$ such as 

$$X \subset (\Lambda_1 + \Lambda_2 + \cdots).$$

Fig. 6.4. A set X and its complementary set X' 


If we denote the length of $\Lambda_k$ by $\ell_k$, the sum of $\ell_k$ must satisfy the 
inequality 

$$0 \le \sum_k \ell_k \le L.$$

In particular, the smallest value of the sum is referred to as the outer 
measure of X and is denoted by 

$$\mu_{\rm out}(X) = \inf\left(\sum_k \ell_k\right).$$

In the same manner, we can find intervals $\Lambda_k' \subset [a, b]$ of lengths $\ell_1', \ell_2', \cdots$ 
that cover the complementary set X' such that 

$$X' \subset (\Lambda_1' + \Lambda_2' + \cdots), \quad 0 \le \sum_k \ell_k' \le L.$$

Here we define another kind of measure denoted by 

$$\mu_{\rm in}(X) = L - \mu_{\rm out}(X') = L - \inf\left(\sum_k \ell_k'\right), \tag{6.13}$$

which is called the inner measure of X. Note that the inner measure of X is 
defined by the outer measure of X', not of X. It is a straightforward matter 
to prove the inequality 

$$0 \le \mu_{\rm in}(X) \le \mu_{\rm out}(X). \tag{6.14}$$

Specifically, if 

$$\mu_{\rm in}(X) = \mu_{\rm out}(X),$$

the common value is called the Lebesgue measure of the point set X, denoted by $\mu(X)$. 
Clearly, when X contains all the points of [a, b], the smallest interval that 
covers [a, b] is [a, b] itself, and thus $\mu(X) = L$. 

Our results are summarized below. 

Lebesgue measure: 

A set of points X is said to be measurable with the Lebesgue 
measure $\mu(X)$ if and only if $\mu_{\rm in}(X) = \mu_{\rm out}(X) = \mu(X)$. 

Remark. An unbounded point set X is measurable if and only if $(-c, c) \cap X$ is 
measurable for all $c > 0$. In this case, we define $\mu(X) = \lim_{c\to\infty}\mu[(-c, c) \cap X]$, 
which may or may not be finite. 


6.2.2 Definition of the Lebesgue Integral 

We are now in a position to define the Lebesgue integral. Let the function 
f(x), defined on a set X, be bounded: 

$$0 \le f_{\min} \le f(x) \le f_{\max}.$$

We partition the ordinate axis by the sequence $\{f_k\}$ ($1 \le k \le n$) so that 
$f_1 = f_{\min}$ and $f_n = f_{\max}$. Since f assigns a value f(x) to every $x \in X$, 
there should exist sets $X_k$ of values x such that 

$$f_k \le f(x) < f_{k+1} \quad \text{for } x \in X_k \quad (1 \le k \le n - 1), \tag{6.15}$$

as well as a set $X_n$ of values x such that $f(x) = f_n$. Each set $X_k$ assumes a 
measure $\mu(X_k)$. Thus we form the sum of products $f_k\cdot\mu(X_k)$ of all possible 
values of f, called the Lebesgue sum: 

$$\sum_{k=1}^{n} f_k\cdot\mu(X_k). \tag{6.16}$$

If the sum (6.16) converges to a finite value when taking the limit $n \to \infty$ 
such that 

$$\max|f_k - f_{k+1}| \to 0,$$

then the limiting value of the sum is called the Lebesgue integral of f(x) 
over the set X. 

The formal definition of the Lebesgue integral is given below. 

Lebesgue integral: 

Let f(x) be a nonnegative function defined on a measurable set X 
and divide X into a finite number of subsets such as 

$$X = X_1 + X_2 + \cdots + X_n. \tag{6.17}$$

Let $f_k = \inf_{x\in X_k} f(x)$ to form the sum 

$$\sum_{k=1}^{n} f_k\,\mu(X_k).$$

Then the Lebesgue integral of f(x) on X is defined by 

$$\int_X f\,d\mu = \lim_{\max|f_k - f_{k-1}|\to 0}\left[\sum_{k=1}^{n} f_k\,\mu(X_k)\right], \tag{6.18}$$

where all possible choices of partition (6.17) are considered. 



Figure 6.5 is a schematic illustration of the Lebesgue procedure. Obviously, 
the value of the Lebesgue sum (6.16) depends on our choice of partition. 
If we take an alternative partition instead of (6.17), the value of the sum 
also changes. Among the infinite variety of choices, the partition that 
maximizes the sum (6.16) gives the Lebesgue integral of f(x). That a function 
is Lebesgue integrable means that the limit superior of the sum in (6.18) is 
determined independently of our choice of the partition of the x-axis. 

Fig. 6.5. An illustration of the Lebesgue procedure 
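
A small numerical sketch may help to see the Lebesgue procedure at work. It partitions the ordinate (the values of f) rather than the abscissa, for the illustrative choice $f(x) = x^2$ on X = [0, 1]. For this particular f, the set $\{x : f_k \le f(x) < f_{k+1}\}$ is the interval $[\sqrt{f_k}, \sqrt{f_{k+1}})$, so its measure is just a length; that shortcut is an assumption tied to this example, not a general recipe.

```python
import math

def lebesgue_sum_square(n):
    """Lebesgue sum (6.16) for f(x) = x^2 on X = [0, 1], with the range [0, 1]
    of f cut into n equal steps f_k = k / n."""
    total = 0.0
    for k in range(n):
        f_lo, f_hi = k / n, (k + 1) / n
        # Measure of X_k = {x : f_lo <= x^2 < f_hi}, an interval for this f.
        measure = math.sqrt(f_hi) - math.sqrt(f_lo)
        total += f_lo * measure
    return total

for n in (10, 100, 1000, 10000):
    print(n, lebesgue_sum_square(n))
```

As the steps in f shrink, the sums increase toward 1/3, the same value that the Riemann procedure gives for this well-behaved function.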


6.2.3 Riemann Integrals vs. Lebesgue Integrals 

Before proceeding further with this discussion, we compare the definitions of 
Riemann and Lebesgue integrals for a better understanding of the significance 
of the latter. In the language of measure, the Riemann integral of a function 
f(x) defined on the set X is obtained by dividing X into nonoverlapping 
subsets $X_i$ as 

$$X = X_1 + X_2 + \cdots + X_n, \quad X_i \cap X_j = \emptyset \ \text{ for any } i \neq j,$$

followed by setting the Riemann sum 

$$\sum_{k=1}^{n} f(\xi_k)\,\mu(X_k). \tag{6.19}$$

Here, the measure $\mu(X_k)$ is identified with the length of the subset $X_k$, and 
$\xi_k$ is any point that belongs to $X_k$. We increase the number of subsets 
$n \to \infty$ such that 

$$\mu(X_k) \to 0 \quad \text{for any } X_k,$$

and if the limit of the sum (6.19) exists and is independent of the subdivision 
process, it is called the Riemann integral of f(x) over X. Obviously, the 
Riemann integral can be defined under the condition that all values of f(x) 
defined over $X_k$ tend to a common limit as $\mu(X_k) \to 0$. Such a requirement 
excludes any possibility of defining the Riemann integral for functions having 
too many points of discontinuity. 

Remark. In view of the analogy between the sum (6.12) and (6.18), we may 
say that, in a sense, the Lebesgue integral is the limit n — * 00 of the quantity 
A a {6). 

Although the Lebesgue sum given in (6.16) is apparently similar to the
Riemann sum given in (6.19), they are intrinsically different. In the Riemann
sum (6.19), $f(\xi_i)$ is the value of $f(x)$ at an arbitrary point $\xi_i \in X_i$.
Thus the value of $\xi_i$ is allowed to vary within each subset, which causes an
indefiniteness in the value of $f(\xi_i)$ within each subset. On the other hand,
in the Lebesgue sum (6.16), the value of $f$ corresponding to each subset $X_i$
has a definite value. Therefore, for the existence of the Lebesgue integral, we
no longer need local smoothness of $f(x)$. As a result, the conditions imposed
on the integrated function become very weak compared with the case of the
Riemann integral.
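To make the contrast concrete, the following Python sketch evaluates both kinds of sums for one function. It is an illustration only and not part of the original text: the function names, the sample function $f(x) = x^2$ on $[0, 1]$, and the use of a fine uniform grid to approximate the measure of each preimage are all assumptions of the example.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum: split the x-axis [a, b] into n equal subintervals."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


def lebesgue_sum(f, a, b, n_levels, n_grid=100_000):
    """Lebesgue-style lower sum: split the range of f into n_levels bands and
    weight the lower edge of each band by the measure of its preimage.

    Assumption of this sketch: the measure of each preimage is approximated
    by counting the points of a fine uniform grid on [a, b] that fall in it.
    """
    h = (b - a) / n_grid
    values = [f(a + (k + 0.5) * h) for k in range(n_grid)]
    lo, hi = min(values), max(values)
    if hi == lo:  # constant function: a single band suffices
        return lo * (b - a)
    width = (hi - lo) / n_levels
    counts = [0] * n_levels
    for v in values:
        j = min(int((v - lo) / width), n_levels - 1)
        counts[j] += 1
    # counts[j] * h approximates mu({x : f(x) lies in band j})
    return sum((lo + j * width) * counts[j] * h for j in range(n_levels))


if __name__ == "__main__":
    square = lambda x: x * x
    print(riemann_sum(square, 0.0, 1.0, 1000))   # partition of the x-axis
    print(lebesgue_sum(square, 0.0, 1.0, 1000))  # partition of the range
```

Both sums approach 1/3 for this smooth function. The difference lies in what is partitioned: the Riemann sum needs $f$ to vary little on each small subinterval, whereas the Lebesgue-style sum only needs the measure of each preimage.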


6.2.4 Properties of the Lebesgue Integrals 

Several properties of the Lebesgue integral are given below without proof. 

1. If $f(x)$ is Lebesgue integrable on $X$ and if $X = X_1 + X_2 + \cdots + X_n$,
then

\[ \int_X f\,d\mu = \sum_{i=1}^{n} \int_{X_i} f\,d\mu. \]

2. If two functions $f(x)$ and $g(x)$ are both Lebesgue integrable on $X$ and if
$f(x) \le g(x)$ for any $x \in X$, then

\[ \int_X f\,d\mu \le \int_X g\,d\mu. \]

3. If $\mu(X) = 0$, then $\int_X f(x)\,dx = 0$.

4. If the integral $\int_X f(x)\,dx$ is finite, then the subset of $X$ defined by

\[ X' = \{x \mid f(x) = \pm\infty\} \]

has zero measure. This means that in order for the integral to converge,
the measure of the set of points $x$ at which $f(x)$ diverges is necessarily zero.

5. Suppose that $\int_X f(x)\,dx$ is finite and that $X' \subset X$. If we make
$\mu(X') \to 0$, then

\[ \int_{X'} f\,d\mu \to 0. \]

6. When $f(x)$ on $X$ takes both positive and negative values, its Lebesgue
integral is defined by

\[ \int_X f\,d\mu = \int_X f^{+}\,d\mu - \int_X f^{-}\,d\mu \tag{6.20} \]

and

\[ \int_X |f|\,d\mu = \int_X f^{+}\,d\mu + \int_X f^{-}\,d\mu, \tag{6.21} \]

where

\[ f^{+}(x) = \begin{cases} f(x) & \text{for } \{x;\ f(x) \ge 0\}, \\ 0 & \text{for } \{x;\ f(x) < 0\}, \end{cases} \]

and

\[ f^{-}(x) = \begin{cases} 0 & \text{for } \{x;\ f(x) \ge 0\}, \\ -f(x) & \text{for } \{x;\ f(x) < 0\}. \end{cases} \]

Definition (6.20) is justified except when both integrals on the right-hand
side diverge.
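The decomposition into positive and negative parts can be checked numerically. The sketch below is an illustration under stated assumptions: it uses $f(x) = \sin x$ on $[0, 2\pi]$, the convention $f^{-} = \max(-f, 0) \ge 0$, and a midpoint rule as a stand-in for the integral. All function names are hypothetical.

```python
import math


def positive_part(v):
    """f^+ : keep nonnegative values, replace negative values by 0."""
    return v if v >= 0 else 0.0


def negative_part(v):
    """f^- : magnitude of negative values, 0 elsewhere (so f^- >= 0)."""
    return -v if v < 0 else 0.0


def midpoint_integral(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h


def split_integrals(f, a, b):
    """Return (integral of f^+, integral of f^-) over [a, b]."""
    i_plus = midpoint_integral(lambda x: positive_part(f(x)), a, b)
    i_minus = midpoint_integral(lambda x: negative_part(f(x)), a, b)
    return i_plus, i_minus


if __name__ == "__main__":
    i_plus, i_minus = split_integrals(math.sin, 0.0, 2.0 * math.pi)
    print("integral of f   ~", i_plus - i_minus)  # difference of the two parts
    print("integral of |f| ~", i_plus + i_minus)  # sum of the two parts
```

For this choice both parts integrate to about 2, so the integral of $f$ is about 0 and the integral of $|f|$ is about 4.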


6.2.5 Null-Measure Property of Countable Sets 

Let us show that any countable set has a Lebesgue measure equal to zero. A 
rigorous definition of countable sets is given herewith. 

Countable set:

A finite or infinite set $X$ is countable (or enumerable) if and only if
it is possible to establish a reciprocal one-to-one correspondence between
its elements and the elements of a set of integers.

It follows that every finite set is countable and that every subset of a countable
set is also countable. Any countable set is associated with a specific number,
called the cardinal number, defined below.

Cardinal numbers:

Two sets $X_1$ and $X_2$ are said to have the same cardinal number if and
only if there exists a reciprocal one-to-one correspondence between their
respective elements.


Remark. A set $X$ is called an infinite set if it has the same cardinal number
as one of its proper subsets; otherwise, $X$ is called a finite set.

It should be stressed that an infinite set may or may not be countable. When
a given infinite set is countable, its cardinal number is denoted by $\aleph_0$,
which is the same as the cardinal number of the set of the positive integers.
Furthermore, the cardinal number of the set of all real numbers (or the set of
points on a continuous line), which is noncountable, is denoted by $\aleph$.
Cardinal numbers of infinite sets, such as $\aleph_0$ and $\aleph$, are called
transfinite numbers.

The most important property of countable sets in terms of measure theory 
is given below. 

Theorem:

Any countable set (finite or infinite) has a Lebesgue measure of zero,
namely, null measure.


Examples An illustrative example is the set of rational numbers, which has
measure zero as shown earlier. The countability of this set follows from the
fact that its elements in $[0, 1]$ can be arranged in a sequence of proper
fractions as

\[ 0,\ 1,\ \frac{1}{2},\ \frac{1}{3},\ \frac{2}{3},\ \frac{1}{4},\ \frac{3}{4},\ \frac{1}{5},\ \frac{2}{5},\ \frac{3}{5},\ \frac{4}{5},\ \ldots \]

Accordingly, since the set of all rational numbers in the interval $[0, 1]$ has zero
measure, the Lebesgue integral of Dirichlet's function $\chi(x)$ over this interval
is well defined and equal to zero.

Another well-known example of a set of measure zero is the Cantor set,
which is demonstrated in Exercise 2.
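The enumeration and the covering argument behind the theorem can be sketched in a few lines of Python. This is an illustration only: the generator name, the helper `cover_length`, and the choice of covering the $n$th point by an interval of length $\varepsilon / 2^n$ are assumptions of the example, chosen to mirror the standard argument.

```python
import math
from fractions import Fraction
from itertools import islice


def rationals_in_unit_interval():
    """Yield 0, 1, 1/2, 1/3, 2/3, 1/4, 3/4, ... : every rational in [0, 1] once."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            if math.gcd(p, q) == 1:  # skip fractions that are not in lowest terms
                yield Fraction(p, q)
        q += 1


def cover_length(n_points, eps):
    """Total length of intervals of length eps / 2**n placed over the first
    n_points elements of the enumeration (n = 1, 2, ..., n_points)."""
    return sum(eps / 2 ** n for n in range(1, n_points + 1))


if __name__ == "__main__":
    print(list(islice(rationals_in_unit_interval(), 11)))
    eps = Fraction(1, 1000)
    print(cover_length(50, eps) < eps)  # the cover never exceeds eps in total length
```

Because the total length stays below $\varepsilon$ however many points are covered, and $\varepsilon$ is arbitrary, the covered set can only have measure zero.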


6.2.6 The Concept of Almost Everywhere 

We have observed that sets of measure zero make no contribution to Lebesgue
integrals. This fact provides a concept of equality almost everywhere
for measurable functions, which plays an important role in developing the
theory of functional analysis.

Equality almost everywhere:

Two functions $f(x)$ and $g(x)$ defined on the same set $X$ are said to be
equal almost everywhere with respect to a measure $\mu$ if

\[ \mu\{x \in X ;\ f(x) \neq g(x)\} = 0. \]


We extend this terminology to other circumstances as well. In general, a
property is said to hold almost everywhere on $X$ if it holds at all points of $X$
except on a set of measure zero. Thus two functions $f(x)$ and $g(x)$ are said to
be equivalent (written $f \sim g$) if they coincide almost everywhere. For
example, Dirichlet's function mentioned earlier is equivalent to
the function $g(x) = 0$.

Since the behavior of functions on sets of measure zero is often unimpor- 
tant, it is natural to introduce the following generalization of the ordinary 
notion of the convergence of a sequence of functions: 

Convergence almost everywhere:

A sequence of functions $\{f_n(x)\}$ defined on a set $X$ is said to converge
almost everywhere to a function $f(x)$ if

\[ \lim_{n \to \infty} f_n(x) = f(x) \tag{6.22} \]

for all $x \in X$ except on a set of measure zero.


Examples A typical example is the sequence

\[ \{f_n(x)\} = \{(-x)^n\} \]

defined on $[0, 1]$. It converges almost everywhere to the function $f(x) = 0$; in
fact it converges everywhere except at the point $x = 1$.
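A short numerical look at this sequence, offered as a sketch rather than a proof (the helper name `f_n` is an assumption of the example): for any fixed $x$ in $[0, 1)$ the values shrink to zero, while at the single point $x = 1$ they alternate between $+1$ and $-1$.

```python
def f_n(x, n):
    """The n-th term of the sequence f_n(x) = (-x)**n on [0, 1]."""
    return (-x) ** n


if __name__ == "__main__":
    for x in (0.5, 0.99, 1.0):
        # Values at increasing n: decay for x < 1, oscillation at x = 1.
        print(x, [f_n(x, n) for n in (10, 11, 1000, 1001)])
```

The exceptional set is the single point $\{1\}$, which has measure zero, so the convergence to $f(x) = 0$ holds almost everywhere.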


Exercises 

1. Show that the set of all rational numbers in the interval [0,1] has a 
Lebesgue measure equal to zero. 

Solution: Let $X$ be the set of rational numbers in $[0, 1]$. Denote by $X'$
the set of irrational numbers that is complementary to $X$, and the entire
interval $[0, 1]$ by $I$. Since $\mu(I) = 1$, the outer measure of $X'$ reads

\[ \mu_{\mathrm{out}}(X') = \mu_{\mathrm{out}}(I - X) = \mu_{\mathrm{out}}(I) - \mu_{\mathrm{out}}(X) = 1 - \mu_{\mathrm{out}}(X). \]

By definition, the inner measure of $X$ is given by

\[ \mu_{\mathrm{in}}(X) = \mu(I) - \mu_{\mathrm{out}}(X') = 1 - [1 - \mu_{\mathrm{out}}(X)] = \mu_{\mathrm{out}}(X). \]

The last equality asserts that the set $X$ is Lebesgue measurable. The
remaining task is to show that $\mu(X) = 0$.

Let $x_k$ ($k = 1, 2, \ldots, n, \ldots$) denote the rational numbers in the
interval $I$. We cover each point $x_1, x_2, \ldots, x_n, \ldots$ by
an open interval of length $\varepsilon/2, \varepsilon/2^2, \ldots, \varepsilon/2^n, \ldots$, respectively,
where $\varepsilon$ is an arbitrary positive number. Since these intervals may
overlap, the entire set can be covered by an open set of measure
not greater than

\[ \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \frac{\varepsilon}{2\left(1 - \frac{1}{2}\right)} = \varepsilon. \]

Since $\varepsilon$ can be made arbitrarily small, we find that $\mu_{\mathrm{out}}(X) = 0$.

Hence, from (6.18) we immediately have $\mu(X) = 0$.
2. Evaluate the measure of a Cantor set, an infinite set constructed as fol- 
lows: (i) From the closed interval [0, 1], delete the open interval (1/3, 2/3) 
that forms its middle third; (ii) from each of the remaining intervals 
[0,1/3] and [2/3,1] delete the middle third; (iii) continue this process 
of deleting the middle thirds indefinitely to obtain the point set on the 
line that remains after all these open intervals. 


Solution: Observe that at the $k$th step, we have thrown out $2^{k-1}$
adjacent intervals of length $1/3^k$. Thus the sum of the lengths of
the intervals removed is equal to

\[ \frac{1}{3} + \frac{2}{3^2} + \cdots + \frac{2^{n-1}}{3^n} + \cdots = \lim_{n \to \infty} \frac{\frac{1}{3}\left[1 - \left(\frac{2}{3}\right)^n\right]}{1 - \frac{2}{3}} = 1. \]

This is just the measure of the open set $P'$ that is the complement
of $P$. Therefore, the Cantor set $P$ itself has null measure:

\[ \mu(P) = 1 - \mu(P') = 1 - 1 = 0. \]
3. Show that if $f(x)$ is nonnegative and integrable on $X$, then for any $c > 0$

\[ \mu[x \in X,\ f(x) \ge c] \le \frac{1}{c} \int_X f\,d\mu, \]

which is known as Chebyshev's inequality.

Solution: Set $X' = \{x \in X,\ f(x) \ge c\}$ to observe that

\[ \int_X f\,d\mu = \int_{X'} f\,d\mu + \int_{X - X'} f\,d\mu \ge \int_{X'} f\,d\mu \ge c\,\mu(X'). \]
4. Show that if $\int_X |f|\,d\mu = 0$, then $f(x) = 0$ almost everywhere.


Solution: By Chebyshev's inequality,

\[ \mu\left[x \in X,\ |f(x)| \ge \frac{1}{n}\right] \le n \int_X |f|\,d\mu = 0 \]

for all $n = 1, 2, \ldots$. Therefore, we have

\[ \mu[x \in X,\ f(x) \neq 0] \le \sum_{n=1}^{\infty} \mu\left[x \in X,\ |f(x)| \ge \frac{1}{n}\right] = 0. \]
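The geometric sum used in the solution of Exercise 2 can be verified with exact rational arithmetic. The sketch below is illustrative; the function name `removed_length` is an assumption, and the code simply adds the lengths of the $2^{k-1}$ intervals of length $1/3^k$ removed at each step.

```python
from fractions import Fraction


def removed_length(n_steps):
    """Total length removed from [0, 1] after n_steps of the Cantor construction:
    at step k, 2**(k-1) open intervals of length 1/3**k are deleted."""
    return sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, n_steps + 1))


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        removed = removed_length(n)
        # What remains has length (2/3)**n, which tends to 0 as n grows.
        print(n, removed, float(1 - removed))
```

The remaining length after $n$ steps equals $(2/3)^n$, consistent with the conclusion that the Cantor set has measure zero.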


6.3 Important Theorems for Lebesgue Integrals 

6.3.1 Monotone Convergence Theorem 

Our current task is to examine whether or not the equality

\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx \tag{6.23} \]

is valid under the Lebesgue procedure. This problem can be clarified by
referring to two important theorems concerning the convergence property of
Lebesgue integrals: the monotone convergence theorem and the dominated
convergence theorem. Neither theorem is valid if we restrict our
attention to Riemann integrable functions. We observe that, owing to the
two convergence theorems, Lebesgue theory offers a considerable improvement
over Riemann theory with regard to convergence properties.

In what follows, we assume that $X$ is a set of real numbers, and that $\{f_n\}$
is a sequence of functions defined on $X$.



Monotone convergence theorem:

If $\{f_n\}$ is a sequence such that $0 \le f_n \le f_{n+1}$ for all $n \ge 1$ in $X$ and
$f = \lim_{n \to \infty} f_n$, then

\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X \lim_{n \to \infty} f_n\,d\mu. \]



Remark. The monotone convergence theorem states that in the case of
Lebesgue integrals, the conditions to reverse the order of limit and integration
are much weaker than in the case of Riemann integrals; i.e., only the pointwise
convergence of $f_n(x)$ to $f(x)$ is required in the Lebesgue case, whereas
in the Riemann case we must have uniform convergence of $f_n(x)$ to $f(x)$.
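As a concrete instance of the theorem, consider the truncations $f_n(x) = \min(n, x^{-1/2})$ on $(0, 1]$, which increase pointwise to the unbounded function $x^{-1/2}$. This example is not taken from the original text; the function names and the midpoint rule are assumptions of the sketch. Elementary calculus gives the integral of $f_n$ over $(0, 1]$ as $2 - 1/n$, which increases to 2, the integral of the limit.

```python
def truncated(x, n):
    """f_n(x) = min(n, x**-0.5): a nondecreasing sequence in n for each x > 0."""
    return min(float(n), x ** -0.5)


def exact_integral(n):
    """Closed form of the integral of f_n over (0, 1]: n * (1/n**2) + (2 - 2/n)."""
    return 2.0 - 1.0 / n


def midpoint_integral(g, a, b, m=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return sum(g(a + (k + 0.5) * h) for k in range(m)) * h


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        numeric = midpoint_integral(lambda x: truncated(x, n), 0.0, 1.0)
        print(n, exact_integral(n), numeric)
```

The integrals form an increasing sequence bounded by 2, and their limit agrees with the integral of the pointwise limit even though the convergence is not uniform near $x = 0$.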
Proof (of the monotone convergence theorem): The hypothesis
$0 \le f_n \le f_{n+1}$ implies that

\[ 0 \le \int_X f_n\,d\mu \le \int_X f_{n+1}\,d\mu, \]

which indicates that the sequence $\{\int_X f_n\,d\mu\}$ increases monotonically
with respect to $n$; thus its limit $n \to \infty$ exists, and we denote it by $M$
(possibly equal to $\infty$). In addition, since $f_n \le f$ by hypothesis,

\[ \int_X f_n\,d\mu \le \int_X f\,d\mu \quad \text{for all } n. \tag{6.24} \]

Since (6.24) is true for arbitrary $n$, we have

\[ M = \lim_{n \to \infty} \int_X f_n\,d\mu \le \int_X f\,d\mu. \]

Therefore, if we can verify the opposite inequality

\[ M \ge \int_X f\,d\mu, \tag{6.25} \]

we will get the desired result,

\[ M = \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X \lim_{n \to \infty} f_n\,d\mu = \int_X f\,d\mu. \]

To show (6.25), let $c$ be a number such that $c \in (0, 1)$ and introduce
the point set

\[ X_n = \{x : c f(x) \le f_n(x)\}. \]

Owing to the monotonically increasing property of the sequence
$\{f_n(x)\}$ with regard to $n$, the set $X_n$ satisfies the inclusion relation

\[ X_1 \subset X_2 \subset X_3 \subset \cdots \quad \text{and} \quad \bigcup_{n=1}^{\infty} X_n = X. \]

In addition, the increasing property of the sequence $\{\int_X f_n\,d\mu\}$ yields

\[ c \int_{X_n} f\,d\mu \le \int_{X_n} f_n\,d\mu \le \lim_{n \to \infty} \int_{\bigcup X_n} f_n\,d\mu = M. \tag{6.26} \]

Since (6.26) must hold for any $n$, we have

\[ c \int_X f\,d\mu \le M. \tag{6.27} \]

Furthermore, since (6.27) is true for all $c \in (0, 1)$, we have

\[ \int_X f\,d\mu \le M. \tag{6.28} \]

Note that letting $c \to 1$ in (6.27) is allowed because the
symbol $\le$, not $<$, is involved in (6.27).
6.3.2 Dominated Convergence Theorem (I)

In the previous argument, we saw that the order of limit and integration can 
be reversed when considering monotonically increasing sequences of functions. 
In practice, however, the requirement in the monotone convergence theorem, 
i.e., that the sequence {fn(x)} must be monotone increasing, is sometimes 
very inconvenient. In this subsection, we examine the same issue for more 
general sequences of functions, i.e., nonmonotone sequences satisfying some 
looser conditions and their limit passage. Our current objective is to prove 
the theorem below. 


Dominated convergence theorem:

Let $\{f_n\}$ be a sequence of functions defined almost everywhere on $X$ such
that (a) $\lim_{n \to \infty} f_n(x) = f(x)$, and (b) there exists a nonnegative
integrable function $g$ such that $|f_n| \le g$ for all $n \ge 1$. Then, we have

\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]


Remark. Note that the condition imposed on the theorem above is that the
sequence $\{f_n\}$ should be bounded almost everywhere by the function $g$. This
condition is clearly looser than that imposed in the monotone convergence
theorem. Hence, the monotone convergence theorem can be regarded as a special
case of the dominated convergence theorem.


6.3.3 Fatou Lemma 

The proof of the dominated convergence theorem requires the lemma given 
below. 


Fatou lemma:

If $f_n(x) \ge 0$ for all $n$ and almost everywhere in a bounded measurable
set $X$ and if $\lim_{n \to \infty} f_n(x) = f(x)$, then

\[ \int_X \liminf_{n \to \infty} f_n\,d\mu = \int_X f\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu, \]

where the definition is

\[ \liminf_{n \to \infty} f_n = \lim_{n \to \infty} \inf_{k \ge n} f_k. \]

Proof Let $g_n = \inf_{k \ge n} f_k$. Since the sequence $g_n(x)$ is nonnegative and
nondecreasing, we have

\[ \lim_{n \to \infty} g_n = \liminf_{n \to \infty} f_n. \]

(See Sect. 2.1.4 for the precise definition of liminf.) In addition, the monotone
convergence theorem implies that

\[ \lim_{n \to \infty} \int_X g_n\,d\mu = \int_X \lim_{n \to \infty} g_n\,d\mu = \int_X \liminf_{n \to \infty} f_n\,d\mu. \tag{6.29} \]

It also follows that

\[ g_n(x) \le f_k(x) \quad \text{for any } k \ge n. \]

Hence,

\[ \int_X g_n\,d\mu \le \int_X f_k\,d\mu \quad \text{for any } k \ge n, \]

that is,

\[ \int_X g_n\,d\mu \le \inf_{k \ge n} \int_X f_k\,d\mu. \]

Taking the limit $n \to \infty$ and applying the monotone convergence theorem,
we get

\[ \lim_{n \to \infty} \int_X g_n\,d\mu \le \lim_{n \to \infty} \inf_{k \ge n} \int_X f_k\,d\mu = \liminf_{n \to \infty} \int_X f_n\,d\mu. \tag{6.30} \]

From (6.29) and (6.30), we conclude that

\[ \int_X \liminf_{n \to \infty} f_n\,d\mu = \int_X f\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu. \]


6.3.4 Dominated Convergence Theorem (II) 

Our next task is to prove the dominated convergence theorem. 

Proof Observe that $f_n$ and $f$ are Lebesgue integrable on $X$. From hypothesis,
it follows that $f_n + g \ge 0$ and $g - f_n \ge 0$ almost everywhere. Thus by Fatou's
lemma, we have

\[ \int_X \liminf_{n \to \infty} (f_n + g)\,d\mu \le \liminf_{n \to \infty} \int_X (f_n + g)\,d\mu \]

or

\[ \int_X \liminf_{n \to \infty} f_n\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu \tag{6.31} \]

by the linearity of the Lebesgue integral. It is also true that $g - f_n \ge 0$ on $X$;
thus also by Fatou's lemma we have

\[ \int_X \liminf_{n \to \infty} (g - f_n)\,d\mu \le \liminf_{n \to \infty} \int_X (g - f_n)\,d\mu, \]

or equivalently,

\[ -\int_X \liminf_{n \to \infty} f_n\,d\mu \le \liminf_{n \to \infty} \left[ -\int_X f_n\,d\mu \right]. \]

The latter inequality can be rewritten as

\[ \int_X \liminf_{n \to \infty} f_n\,d\mu \ge \limsup_{n \to \infty} \int_X f_n\,d\mu. \tag{6.32} \]

From (6.31) and (6.32) we get

\[ \int_X \liminf_{n \to \infty} f_n\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu \le \limsup_{n \to \infty} \int_X f_n\,d\mu \le \int_X \liminf_{n \to \infty} f_n\,d\mu, \]

which clearly indicates that

\[ \liminf_{n \to \infty} \int_X f_n\,d\mu = \limsup_{n \to \infty} \int_X f_n\,d\mu, \]

so that the limit $\lim_{n \to \infty} \int_X f_n\,d\mu$ exists and is equal to
$\int_X \lim_{n \to \infty} f_n\,d\mu = \int_X f\,d\mu$. This completes the proof of the theorem.


6.3.5 Fubini Theorem 


For a function of several variables, we may define the Lebesgue integral by
exactly the same process as for a function of one variable. In cases of two
variables, for instance, a rectangle $S = [a, b] \times [c, d]$ takes on the role of
intervals, and we need only to imitate the definitions and methods that we used
for functions of a single variable. We can develop the theory for the entire
plane $\mathbb{R}^2$ analogously to that for the real axis $\mathbb{R}$. In fact, all the consequences
in Sect. 6.2 for Lebesgue integrable functions on a closed interval $[a, b]$ are
easily carried over to the corresponding propositions for the double integral
on the rectangle $S$ without modifying the actual proofs in Sect. 6.2, except
for replacing $f(x)$ by $f(x, y)$.

However, an important new problem arises here. If $f$ is integrable on the
rectangle $S = [a, b] \times [c, d]$, we have to determine whether the value of the
integral

\[ \iint_S f(x, y)\,dx\,dy \tag{6.33} \]

is equal to that of the repeated integrals

\[ \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy \quad \text{and} \quad \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx. \]

This is true for continuous functions on $S$. But it is far from obvious that
the existence of the double integral (6.33) guarantees the existence of either
repeated integral.

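The following example may lead the reader to consider the point mentioned
above.

Examples Assume the function

\[ f(x, y) = \begin{cases} \dfrac{x^2 - y^2}{(x^2 + y^2)^2} & \text{for } (x, y) \neq (0, 0), \\ 0 & \text{for } (x, y) = (0, 0), \end{cases} \tag{6.34} \]

and compute the repeated integrals

\[ I_{yx} = \int_0^1 dy \int_0^1 f(x, y)\,dx \quad \text{and} \quad I_{xy} = \int_0^1 dx \int_0^1 f(x, y)\,dy. \]

Straightforward calculations yield

\[ I_{xy} = \int_0^1 dx \int_0^1 \frac{\partial}{\partial y} \left( \frac{y}{x^2 + y^2} \right) dy = \int_0^1 \frac{dx}{x^2 + 1} = \frac{\pi}{4} \]

and

\[ I_{yx} = \int_0^1 dy \int_0^1 \frac{\partial}{\partial x} \left( \frac{-x}{x^2 + y^2} \right) dx = \int_0^1 \frac{-dy}{y^2 + 1} = -\frac{\pi}{4}. \]

Hence, we conclude that

\[ I_{xy} \neq I_{yx}, \]

which indicates that the order of integrations with respect to $x$ and $y$ cannot
be changed.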

We now present the main theorem of this subsection. 


Fubini theorem:

Let the function $f(x, y)$ be integrable on a rectangle $S = [a, b] \times [c, d]$. Then
the following equalities hold:

\[ \iint_S f(x, y)\,dx\,dy = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx. \]

According to Fubini's theorem, a double integral $\iint_S f(x, y)\,dx\,dy$ is computed
by integrating first with respect to $x$ and then with respect to $y$, or vice versa.
We omit an exact proof of the Fubini theorem, since it requires rather lengthy
arguments regarding the existence and the convergence of the double integrals.
Instead, we present some applications of the theorem.

The following is an extension of the Fubini theorem: 


Fubini-Hobson-Tonelli theorem:

Let the function $f(x, y)$ be defined on $S = [a, b] \times [c, d]$. Then, if either of
the repeated integrals

\[ \int_a^b \left[ \int_c^d |f(x, y)|\,dy \right] dx \quad \text{or} \quad \int_c^d \left[ \int_a^b |f(x, y)|\,dx \right] dy \]

exists, $f$ is integrable on $S$ and, hence,

\[ \iint_S f(x, y)\,dx\,dy = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy. \]


Both the Fubini theorem and the Fubini-Hobson-Tonelli theorem for integrals
on a rectangle $S$ may be easily extended to integrals on all of $\mathbb{R}^2$ or to the
integrals on any measurable subsets of $\mathbb{R}^2$.


Exercises 

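1. Suppose that the function

\[ g_k(x) = -2k^2 x e^{-k^2 x^2} + 2(k+1)^2 x e^{-(k+1)^2 x^2} \]

is defined on $[0, \infty)$, and form the sum

\[ f_n(x) = \sum_{k=1}^{n} g_k(x) = -2x e^{-x^2} + 2(n+1)^2 x e^{-(n+1)^2 x^2}. \]

Show that $\int_0^\infty \lim_{n \to \infty} f_n(x)\,dx \neq \lim_{n \to \infty} \int_0^\infty f_n(x)\,dx$.

Solution: We have

\[ \int_0^\infty \lim_{n \to \infty} f_n(x)\,dx = \int_0^\infty \left( -2x e^{-x^2} \right) dx = \left[ e^{-x^2} \right]_0^\infty = -1, \]

whereas

\[ \lim_{n \to \infty} \int_0^\infty f_n(x)\,dx = \lim_{n \to \infty} \left[ e^{-x^2} - e^{-(n+1)^2 x^2} \right]_0^\infty = 0. \]

Therefore, (6.23) is not valid.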
2. Given the function

\[ f_n(x) = \begin{cases} n \sin nx & \text{for } 0 \le x \le \pi/n, \\ 0 & \text{for } \pi/n < x \le \pi, \end{cases} \]

show that

\[ \lim_{n \to \infty} \int_0^\pi f_n(x)\,dx \neq \int_0^\pi \lim_{n \to \infty} f_n(x)\,dx. \]

Solution: We have $\lim_{n \to \infty} f_n(x) = 0$ for every $x$ in $[0, \pi]$ and
$\lim_{n \to \infty} \int_0^\pi f_n(x)\,dx = 2$. Hence, we obtain the desired result.

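3. Suppose that the nonnegative functions $\{f_n(x) : n \in \mathbb{N}\}$ are each
summable over a measurable set $X$, and $f_n \ge f_{n+1}$ on $X$. Show that
the limit function $f = \lim_{n \to \infty} f_n$ is summable over $X$ and that

\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]


Solution: Let $g_n = f_1 - f_n$, so that $0 = g_1 \le g_2 \le \cdots \le f_1$. Thus,
the dominated convergence theorem ensures that $\lim_{n \to \infty} g_n = f_1 - f$
is integrable, and we have

\[ \lim_{n \to \infty} \int_X (f_1 - f_n)\,d\mu = \int_X (f_1 - f)\,d\mu, \]

which gives

\[ \int_X f_1\,d\mu - \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X (f_1 - f)\,d\mu. \]

Further, as $f$ is integrable since $0 \le f \le f_1$, we have

\[ \int_X (f_1 - f)\,d\mu = \int_X f_1\,d\mu - \int_X f\,d\mu, \]

so that

\[ \int_X f_1\,d\mu - \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f_1\,d\mu - \int_X f\,d\mu, \]

which gives

\[ \lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \]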


4. Examine the applicability to integrals $\int_0^\infty f_n(x)\,dx$ of dominated and
monotone convergence theorems for the following: (i) $f_n(x) = 2n^2 e^{-n^2 x^2}$;
(ii) $f_n(x) = n x e^{-n x^2}$.

Solution:

(i) Setting $y = nx$, we have

\[ \int_0^\infty f_n(x)\,dx = \int_0^\infty 2n^2 e^{-n^2 x^2}\,dx = \int_0^\infty 2n e^{-y^2}\,dy = n\sqrt{\pi}, \]

where the last term diverges as $n \to \infty$. Hence, the limit
$\lim_{n \to \infty} \int_0^\infty f_n(x)\,dx$ does not exist. Next, we observe that for $x \neq 0$,

\[ \lim_{n \to \infty} f_n(x) = \lim_{n \to \infty} \left( 2n^2 e^{-n^2 x^2} \right) = 0, \]

whereas for $x = 0$,

\[ \lim_{n \to \infty} f_n(0) = \lim_{n \to \infty} \left( 2n^2 \right) = \infty. \]

Thus, there is no limiting function $f = \lim_{n \to \infty} f_n$ that
satisfies the inequality $f(x) \ge 2n^2 e^{-n^2 x^2}$ for all $n$ in $X$, and we
can conclude that neither the dominated nor the monotone
convergence theorem is applicable.

(ii) It is found that

\[ \int_0^\infty f_n(x)\,dx = \int_0^\infty n x e^{-n x^2}\,dx = \frac{1}{2} \]

and that $n x e^{-n x^2} \to 0$ pointwise as $n \to \infty$. Therefore, the
limiting function $f(x)$ satisfying the inequality $f(x) \ge n x e^{-n x^2}$ does
not exist. Hence, neither the dominated nor the monotone convergence
theorem is applicable.


5. Using Fubini's theorem, derive the formula

\[ \int_0^1 \frac{x^b - x^a}{\log x}\,dx = \log \frac{1 + b}{1 + a} \quad \text{for } a, b > 0. \tag{6.35} \]


Solution: Note that the integral on the left-hand side is beyond
elementary calculus, so that it is impossible to achieve (6.35) by
straightforward calculations. Instead, we observe that

\[ \int_0^1 dx \int_a^b x^y\,dy = \int_0^1 \frac{x^b - x^a}{\log x}\,dx \]

and

\[ \int_a^b dy \int_0^1 x^y\,dx = \int_a^b \frac{dy}{y + 1} = \log \frac{1 + b}{1 + a}. \]

Thus, when we apply the Fubini theorem to the double integral

\[ \iint_{[0 \le x \le 1,\ a \le y \le b]} x^y\,dx\,dy, \]

we obtain the desired result (6.35).


6. Show that the function $f(x, y)$ given in (6.34) in Sect. 6.3.5 is not
integrable on $[0, 1] \times [0, 1]$.


Solution: It follows that

\[ \iint_{0 \le x, y \le 1} \frac{|x^2 - y^2|}{(x^2 + y^2)^2}\,dx\,dy = 2 \iint_{0 \le x \le y \le 1} \frac{y^2 - x^2}{(x^2 + y^2)^2}\,dx\,dy = 2 \int_0^1 dy \int_0^y \frac{y^2 - x^2}{(x^2 + y^2)^2}\,dx = \int_0^1 \frac{dy}{y} = \infty. \]

This means that the existence of the two repeated integrals does not
guarantee the existence of the double integral.
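The failure of the interchange in Exercise 2 can also be observed numerically. The following sketch is illustrative only; the function names and the midpoint rule are assumptions of the example. It shows that the integral of $f_n$ over $[0, \pi]$ stays at 2 for each tested $n$, while at any fixed point the values $f_n(x)$ eventually vanish.

```python
import math


def f_n(x, n):
    """f_n(x) = n sin(nx) on [0, pi/n] and 0 on (pi/n, pi]."""
    return n * math.sin(n * x) if 0.0 <= x <= math.pi / n else 0.0


def integral_f_n(n, m=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, pi]."""
    h = math.pi / m
    return sum(f_n((k + 0.5) * h, n) for k in range(m)) * h


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, integral_f_n(n))             # stays close to 2
    print([f_n(0.5, n) for n in (1, 3, 7, 50)])  # zero once pi/n < 0.5
```

No integrable function dominates every $f_n$ here, since the peak of height $n$ moves toward the origin, which is why the dominated convergence theorem does not apply.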


6.4 The Lebesgue Spaces $L^p$

6.4.1 The Spaces of $L^p$

We close this chapter by demonstrating the relevance of the Lebesgue
integral theory to the functional analysis that we discussed in Chap. 4. The
Lebesgue theory on integration enables us to introduce certain spaces of
functions that have properties that are of great importance in analysis as well
as in mathematical physics, in particular, quantum mechanics. These are the
so-called $L^p$ spaces of complex-valued functions $f$ such that $|f|^p$ is integrable.

We have already dealt with the concept of Hilbert space. In fact, $L^2$ for
any measure $\mu$ satisfies the conditions for a Hilbert space. We begin with a
short review of the definition of $L^p$ spaces in terms of measure, and follow this
by examining how the spaces possess vector space properties owing to the use
of the Lebesgue integral.

Let $p$ be a positive real number and let $X$ be a measurable set in $\mathbb{R}$. The
$L^p$ space is defined as follows:


Definition of $L^p$ space:

The $L^p$ space is a set of complex-valued Lebesgue measurable functions
$f(x)$ on $X$ that satisfy

\[ \int_X |f|^p\,d\mu < \infty \]

for $p \ge 1$.

When the integral $\int_X |f(x)|^p\,dx$ exists, we define the $p$-norm of $f$ by

\[ \|f\|_p = \left( \int_X |f|^p\,d\mu \right)^{1/p}. \]

Clearly for $p = 2$, the present definition reduces to our earlier definition of $L^2$.

6.4.2 Hölder Inequality

The following two inequalities are fundamental in that they demonstrate the
relations between the norms of functions in $L^p$.

Hölder inequality:

For any $f \in L^p$ and $g \in L^q$ under the conditions

\[ p, q > 1 \quad \text{and} \quad \frac{1}{p} + \frac{1}{q} = 1, \]

we have

\[ fg \in L^1 \quad \text{and} \quad \|fg\|_1 \le \|f\|_p \|g\|_q. \]


Proof We assume that neither $f$ nor $g$ is zero almost everywhere (otherwise,
the result is trivial). To proceed with the proof, we first observe the inequality

\[ a^{1/p} b^{1/q} \le \frac{a}{p} + \frac{b}{q} \quad \text{for } a, b \ge 0, \tag{6.36} \]

which we justify by rewriting it (for $b > 0$; the case $b = 0$ is trivial) as

\[ t^{1/p} \le \frac{t}{p} + \frac{1}{q}, \]

where we set $t = a/b$. Then, we note that the function given by

\[ \varphi(t) = t^{1/p} - \frac{t}{p} - \frac{1}{q} \]

has a maximum at $t = 1$, namely,

\[ \max \varphi(t) = \varphi(1) = 1 - \frac{1}{p} - \frac{1}{q} = 0, \]

which results in the inequality (6.36), which we use to obtain

\[ \frac{|f(x) g(x)|}{AB} \le \frac{A^{-p} |f(x)|^p}{p} + \frac{B^{-q} |g(x)|^q}{q}, \tag{6.37} \]

where

\[ A = \left[ \int_X |f|^p\,d\mu \right]^{1/p} \quad \text{and} \quad B = \left[ \int_X |g|^q\,d\mu \right]^{1/q}. \]

The right-hand side of (6.37) is integrable from the hypothesis that $f \in L^p$
and $g \in L^q$. Therefore, using (6.37) we obtain

\[ \frac{1}{AB} \int_X |fg|\,d\mu \le \frac{A^{-p}}{p} \int_X |f|^p\,d\mu + \frac{B^{-q}}{q} \int_X |g|^q\,d\mu = \frac{1}{p} + \frac{1}{q} = 1. \]

Consequently, we have

\[ \int_X |fg|\,d\mu \le AB, \]

which proves the inequality.


6.4.3 Minkowski Inequality

The other inequality of interest is stated below.

Minkowski inequality:

If $f, g \in L^p$ with $p \ge 1$, then

\[ f + g \in L^p \quad \text{and} \quad \|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{6.38} \]


Proof For $p = 1$, the inequality is readily obtained by integrating the triangle
inequality for real numbers. For $p > 1$, it follows that

\[ \int_X |f + g|^p\,d\mu \le \int_X |f + g|^{p-1} |f|\,d\mu + \int_X |f + g|^{p-1} |g|\,d\mu. \]

Let $q > 0$ be such that

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Applying the Hölder inequality to each of these last two integrals and noting
that $(p - 1)q = p$ gives us

\[ \int_X |f(x) + g(x)|^p\,dx \le M \left[ \int_X |f + g|^{(p-1)q}\,d\mu \right]^{1/q} = M \left[ \int_X |f + g|^p\,d\mu \right]^{1/q}, \tag{6.39} \]

where $M$ denotes the right-hand side of the inequality (6.38) that we would
like to prove. Now divide the extreme ends of the relation (6.39) by

\[ \left[ \int_X |f + g|^p\,d\mu \right]^{1/q} \]

to obtain the desired result.


Remark. It should be noted that neither the Hölder inequality nor the
Minkowski inequality holds for $0 < p < 1$ if $\mu(X) > 0$, which is why we
restrict ourselves to $p \ge 1$.
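Both inequalities, and the failure noted in the remark, can be probed on a discretized interval. The sketch below is an illustration, not part of the original text: functions on $[0, 1]$ are replaced by their samples on a uniform grid, each grid cell carrying measure $h$, so the integrals become weighted sums. The sample functions and all names are assumptions of the example.

```python
import math


def p_norm(values, p, h):
    """Discrete p-norm (sum |v|^p * h)^(1/p); h is the measure of each grid cell."""
    return (sum(abs(v) ** p for v in values) * h) ** (1.0 / p)


def holder_sides(f, g, p, h):
    """Return (||fg||_1, ||f||_p * ||g||_q) with 1/p + 1/q = 1, p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(f, g)) * h
    return lhs, p_norm(f, p, h) * p_norm(g, q, h)


def minkowski_sides(f, g, p, h):
    """Return (||f + g||_p, ||f||_p + ||g||_p)."""
    lhs = p_norm([a + b for a, b in zip(f, g)], p, h)
    return lhs, p_norm(f, p, h) + p_norm(g, p, h)


def sample_grid(n=1000):
    """Midpoints of n equal cells of [0, 1] and the cell width h."""
    h = 1.0 / n
    return [(k + 0.5) * h for k in range(n)], h


if __name__ == "__main__":
    xs, h = sample_grid()
    f = [math.sin(7.0 * x) for x in xs]
    g = [math.exp(-x) * math.cos(3.0 * x) for x in xs]
    print("Hoelder   (p = 3):", holder_sides(f, g, 3.0, h))
    print("Minkowski (p = 3):", minkowski_sides(f, g, 3.0, h))
    # For p = 1/2 the triangle inequality fails for two disjoint indicators.
    u = [1.0 if x < 0.5 else 0.0 for x in xs]
    v = [1.0 - a for a in u]
    print("Minkowski (p = 1/2):", minkowski_sides(u, v, 0.5, h))
```

In the last line the left-hand side is about 1 while the right-hand side is about 0.5, which shows why the expression with $0 < p < 1$ does not define a norm.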


6.4.4 Completeness of $L^p$ Spaces

By virtue of the two inequalities discussed above, we can show the
completeness property of $L^p$ spaces, which is crucially important for developing
Hilbert space theory for Lebesgue measurable functions.

Completeness of $L^p$ spaces:

The space $L^p$ is complete: i.e., for any $f_n \in L^p$ satisfying

\[ \lim_{n, m \to \infty} \|f_n - f_m\|_p = 0, \]

there exists $f \in L^p$ such that

\[ \lim_{n \to \infty} \|f_n - f\|_p = 0. \]


Proof Let $\{f_n\}$ be a Cauchy sequence in $L^p$. Then, there is a natural
number $n_1$ such that for all $n > n_1$, we have

\[ \|f_n - f_{n_1}\| < \frac{1}{2}. \]

By induction, after finding $n_{k-1} > n_{k-2}$, we find $n_k > n_{k-1}$ such that for all
$n > n_k$ we have

\[ \|f_n - f_{n_k}\| < \frac{1}{2^k}. \]

Then $\{f_{n_k}\}$ is a subsequence of $\{f_n\}$ that satisfies

\[ \|f_{n_{k+1}} - f_{n_k}\| < \frac{1}{2^k} \]

or

\[ \|f_{n_1}\| + \sum_{k=1}^{\infty} \|f_{n_{k+1}} - f_{n_k}\| = A < \infty. \]

Let

\[ g_k = |f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots + |f_{n_{k+1}} - f_{n_k}|, \quad k = 1, 2, \ldots. \]

Then, by the Minkowski inequality,

\[ \int_X g_k^p(x)\,d\mu = \int_X \left( |f_{n_1}| + |f_{n_2} - f_{n_1}| + \cdots + |f_{n_{k+1}} - f_{n_k}| \right)^p d\mu \le \left( \|f_{n_1}\|_p + \sum_{k=1}^{\infty} \|f_{n_{k+1}} - f_{n_k}\|_p \right)^p \le A^p < \infty. \]

Let $g = \lim g_k$. Then $g^p = \lim g_k^p$. By the monotone convergence theorem
given in Sect. 6.3.1, we have

\[ \int_X g^p\,d\mu = \lim_{k \to \infty} \int_X g_k^p\,d\mu < \infty, \]

which shows that $g$ is in $L^p$, and hence

\[ \int_X \left( |f_{n_1}| + \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}| \right)^p dx < \infty, \]

implying that

\[ f_{n_1} + \sum_{k=1}^{\infty} \left( f_{n_{k+1}} - f_{n_k} \right) \]

converges absolutely almost everywhere to a function $f \in L^p$.

It remains to prove that $\|f_{n_k} - f\| \to 0$ as $k \to \infty$. We first note that

\[ f(x) - f_{n_j}(x) = \sum_{k=j}^{\infty} \left[ f_{n_{k+1}}(x) - f_{n_k}(x) \right]. \]

It then follows that

\[ \|f - f_{n_j}\|_p \le \sum_{k=j}^{\infty} \|f_{n_{k+1}} - f_{n_k}\|_p < \sum_{k=j}^{\infty} \frac{1}{2^k} = \frac{1}{2^{j-1}}. \]

Therefore, $\|f - f_{n_j}\|_p \to 0$ as $j \to \infty$. Now

\[ \|f_n - f\|_p \le \|f_n - f_{n_k}\|_p + \|f_{n_k} - f\|_p, \]

where $\|f_n - f_{n_k}\|_p \to 0$ as $n \to \infty$ and $k \to \infty$, and thus
$\|f_n - f\|_p \to 0$ as $n \to \infty$. This shows that the Cauchy sequence $\{f_n\}$
converges to $f$ in $L^p$.
Before closing this chapter, we must emphasize that if we employ the Riemann
integral to construct $L^p$ spaces, the theorem mentioned above breaks down so
that we can no longer expect completeness of the resulting function space. To
illustrate this point, we temporarily define the '$L^1$ space' by a set of Riemann
integrable functions under the '1-norm':

\[ \|f\|^{[R]} = \int_0^1 |f(x)|\,dx < \infty. \]

We then consider a function

\[ f_n(x) = \begin{cases} 1 & \text{for } x \in \{a_1, a_2, \ldots, a_n\}, \\ 0 & \text{otherwise}, \end{cases} \]

where $\{a_n\}$ ($n = 1, 2, \ldots$) is an infinite sequence of all rational numbers
in $[0, 1]$. It readily follows that the function $f_n(x) - f_m(x) \in L^1$ is Riemann
integrable and satisfies

\[ \|f_n - f_m\|^{[R]} = \int_0^1 |f_n(x) - f_m(x)|\,dx = 0. \]

Nevertheless, $f_n(x)$ converges to Dirichlet's function $\chi(x)$, which is not
Riemann integrable as noted earlier. As it is impossible to examine the
quantity

\[ \|f_n - \chi\|^{[R]} \]

using Riemann integrals, we cannot establish a complete function space
based on that method.


6.5 Applications in Physics and Engineering 


6.5.1 Practical Significance of Lebesgue Integrals 


From a practical viewpoint, what makes Lebesgue integrals so important is
the fact that they allow us to interchange the order of integration and other
limiting procedures under very weak conditions, which is not possible in the
case of Riemann integrals. In fact, in the case of Riemann integrals, the
identities

\[ \lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} \lim_{n \to \infty} f_n(x)\,dx \]

and

\[ \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} \sum_{n=1}^{\infty} f_n(x)\,dx \]

are valid only if the integrands on the right-hand side, i.e., $\lim f_n$ and $\sum f_n$, are
uniformly convergent. Such a restriction can be removed by using a Lebesgue
integral since with the latter, only pointwise convergence of the integrand,
together with monotonicity or domination by an integrable function,
is needed. We saw in Sect. 6.3 that the Lebesgue convergence theorems
and Fubini's theorem markedly weaken the conditions necessary for the
validity of an interchange of the order of integration and limiting procedures.
As a result, we need to pay far less attention to the order of the limiting
procedures, which is very useful in the practical calculations encountered in
physics and engineering.


6.5.2 Contraction Mapping 

Another important consequence of Lebesgue integral theory is the
completeness of the function space $L^p$ spanned by Lebesgue integrable functions.
$L^p$ spaces have a wide range of applications in physics, statistics, engineering,
and other disciplines. For instance, they serve as a basis in the development of
a rigorous theory of Fourier transformation, in which the mappings between
two different $L^p$ spaces are considered. Moreover, the theory of quantum
mechanics is established on the basis of the $L^2$ space, a specific class of $L^p$ spaces
with $p = 2$. In both applications, the completeness property of the $L^p$ space
plays a crucial role in making the theory self-contained. In order for the reader
to learn more about this issue, we present the contraction mapping
theorem (or Banach's fixed point theorem) below. This theorem proves the
existence of a unique solution to a certain kind of equation associated with
Lebesgue integrable functions, which makes the theory based on $L^p$ spaces
self-contained.

A preliminary terminology is defined below.


Contraction mapping:

A contraction mapping $T$ is a mapping from $L^p$ into $L^p$ that satisfies
the relation

\[ \|T(f) - T(g)\| \le c\,\|f - g\| \quad (0 < c < 1) \tag{6.40} \]

for any $f, g \in L^p$ (see Fig. 6.6).

Fig. 6.6. Sketch of a contraction mapping $T$ acting on $f, g \in L^p$
Remark. If $T$ is regarded as a differential operator acting on a Lebesgue
integrable function $f$, then we can say that 'a contraction mapping is a mapping
that satisfies the Lipschitz condition' (see Sect. 15.2.3).

We should keep in mind that the norm $\|\cdot\|$ used in (6.40) is in terms of
$L^p$ spaces, so that $\|f - g\| = 0$ means $f = g$ almost everywhere. In plain
words, a contraction mapping reduces the distance between two elements in
the $L^p$ space.

We are now ready to move on to the main theorem. 

Contraction mapping theorem:

Let $T$ be a contraction mapping and $I$ be an identity mapping. Then
the equation

\[ (T - I) f = 0 \tag{6.41} \]

has one and only one solution $f$ that belongs to $L^p$.


Remark. The solution $f$ of the equation (6.41) is called a fixed point in $L^p$.

The contraction mapping theorem guarantees the existence and uniqueness of 
fixed points of certain self-mappings and provides a constructive method for 
finding those fixed points. It should be emphasized that the theorem allows 
us to prove the existence (and uniqueness) of solutions of ordinary differen- 
tial equations with respect to Lebesgue integrable functions, as intuitively 
understood if T is set to be a differential operator. 

Proof (of the contraction mapping theorem): For arbitrary $f_0 \in L^p$,
we introduce a sequence of functions $\{f_n\}$ defined by

$$f_1 = T(f_0), \quad f_2 = T(f_1), \quad \cdots, \quad f_n = T(f_{n-1}), \quad \cdots.$$

We shall see below that the sequence $\{f_n\}$ is a Cauchy sequence
and thus has a limit $f = \lim_{n \to \infty} f_n$. It follows from the definition of
$T$ that

$$\begin{aligned}
\|f_n - f_{n+j}\| &= \|T(f_{n-1}) - T(f_{n-1+j})\| \\
&\le c\,\|f_{n-1} - f_{n-1+j}\| \\
&\le \cdots \le c^n \|f_0 - f_j\|
\end{aligned} \tag{6.42}$$

and

$$\begin{aligned}
\|f_0 - f_j\| &\le \|f_0 - f_1\| + \cdots + \|f_{j-1} - f_j\| \\
&\le (1 + c + \cdots + c^{j-1})\,\|f_0 - f_1\| \\
&\le (1 - c)^{-1}\,\|f_0 - f_1\|,
\end{aligned} \tag{6.43}$$

where we used the Minkowski inequality (6.38) with respect to the
$p$-norm designated by $\|\cdot\|$. From (6.42) and (6.43), we obtain

$$\|f_n - f_{n+j}\| \le \frac{c^n}{1 - c}\,\|f_0 - f_1\| \to 0 \quad (n \to \infty).$$

This indicates that $\{f_n\}$ is a Cauchy sequence and thus converges to
a limit (denoted by $f$) regardless of the choice of $f_0$. Furthermore, the
limit $f$ always belongs to $L^p$ since the space $L^p$ is complete. Hence,
the convergence of $\{f_n\}$ to $f$ can be expressed in terms of the
norm of $L^p$ as

$$\lim_{n \to \infty} \|f_n - f\| = 0. \tag{6.44}$$

We then obtain

$$\begin{aligned}
\|T(f) - f\| &\le \|T(f) - f_n\| + \|f_n - f\| \\
&= \|T(f) - T(f_{n-1})\| + \|f_n - f\| \\
&\le c\,\|f - f_{n-1}\| + \|f_n - f\| \to 0 \quad (n \to \infty),
\end{aligned} \tag{6.45}$$

which means that $T(f) = f$ almost everywhere. Consequently, equation
(6.41) has at least one solution, namely the limit $f$ of the sequence
$\{f_n\}$ that we introduced.

The uniqueness of the solution $f$ is readily understood. Suppose that
$g \in L^p$ satisfies $T(g) = g$. We then have

$$\|g - f\| = \|T(g) - T(f)\| \le c\,\|g - f\|.$$

This means that $\|g - f\| = 0$ since $0 < c < 1$, so we have $g = f$ almost
everywhere. ∎
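The constructive character of the proof can be illustrated by a minimal sketch. As a simplifying assumption, the complete space $L^p$ is replaced here by the real line with the absolute value as norm, and $T(x) = \frac{1}{2}\cos x$ is a hypothetical contraction with $c = 1/2$ (since $|T'(x)| \le 1/2$); neither choice appears in the text. The stopping rule uses the bound $\|f_{n+1} - f\| \le \frac{c}{1-c}\|f_{n+1} - f_n\|$, which follows from the same estimates as (6.42) and (6.43).

```python
import math


def fixed_point(T, x0, c, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = T(x_n) for a contraction T with constant c.

    Stops when the bound c/(1-c) * |x_{n+1} - x_n| on the distance
    to the fixed point drops below tol.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        if c / (1.0 - c) * abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("iteration did not converge")


def T(x):
    # Hypothetical contraction on the real line: |T'(x)| <= 1/2.
    return 0.5 * math.cos(x)


# Two very different starting points lead to the same fixed point.
x_star, n_iter = fixed_point(T, x0=3.0, c=0.5)
y_star, _ = fixed_point(T, x0=-10.0, c=0.5)
print(x_star, y_star, n_iter)
```

The two runs agree to within the tolerance, which mirrors the statement that the limit does not depend on the choice of $f_0$.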

Remark. Note that it is our use of the Lebesgue integral (instead of the
Riemann integral) that guarantees the validity of the contraction mapping
theorem. In fact, if we restrict ourselves to the Riemann integral, the limit $f$
of the sequence $\{f_n\}$ may not belong to $L^p$, and we can no longer obtain the
result (6.45).


6.5.3 Preliminaries for the Central Limit Theorem 

The effectiveness of Lebesgue integrals is also observed in probability theory,
particularly in the derivation of the central limit theorem, which plays
a fundamental role in statistical mechanics and in the statistical analysis of
experimental data. Later, we shall see that employing Lebesgue integrals is
necessary for proving the central limit theorem, where the Lebesgue convergence
theorem and Fubini’s theorem are used time and again.

In order to prove the central limit theorem, we introduce a random variable
$x$ (see Sect. 6.1.3); for instance, $x$ may be the number of spots we get
when throwing a pair of dice or a real number that we randomly pick from
an interval on the real axis. Suppose that $x$ lies in a set $X$ on the real axis.
(Here, $X$ may be a continuous interval, a set of discrete points, or a union of
the two.) In modern probability theory, the quantities characterizing the statistical
properties of the system considered are defined in terms of the Lebesgue
integral. For instance, the probability (or distribution) that $x$ is found in a
subset $X_0 \subset X$ is given by

$$P(x : x \in X_0) = \int_{X_0} p \, d\mu, \tag{6.46}$$

where $\mu$ is the Lebesgue measure on $X_0$ and $p$ is the probability density
associated with $x$. In general, $p$ is assumed to satisfy the normalization condition
$\int_X p \, d\mu = 1$. We say that the random variables $x$ and $y$ are independent
if

$$P(x, y) = P(x) P(y).$$

Moreover, the variables $x$ and $y$ are said to be identically distributed if

$$P(x) = P(y).$$

We also define the expected (or mean) value of $x$ and the variance of
$x$ by the integrals

$$E\{x\} = \int_X x \, p \, d\mu \quad \text{and} \quad V\{x\} = \int_X (x - E\{x\})^2 \, p \, d\mu,$$

respectively, where $\mu$ is the Lebesgue measure on $X$. In particular, the expected
value of the imaginary exponential $e^{izx}$, where $z$ is real, is known as the
characteristic function.


Characteristic function:

The characteristic function $\varphi_x(z)$ of a random variable $x$ is defined by

$$\varphi_x(z) = E\{e^{izx}\}.$$

It can be shown that

$$E\{e^{iz(x+y)}\} = \varphi_x(z)\,\varphi_y(z)$$

if the random variables $x$ and $y$ are independent (the converse does not hold
in general). Furthermore, we obtain

$$\varphi_x(z) = \varphi_y(z)$$

if and only if the variables $x$ and $y$ are identically distributed. The latter
statement is known as the uniqueness theorem for characteristic functions,
and the proof, which involves Fubini’s theorem, can be found in advanced
texts on probability theory.
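The product rule for independent variables can be checked in a simple discrete setting. The sketch below assumes a fair six-sided die as the random variable (a choice made here for illustration), so the expectation reduces to a finite weighted sum; the helper name `char_fn` is hypothetical.

```python
import cmath
from itertools import product


def char_fn(pmf, z):
    """phi(z) = E{exp(i z x)} for a discrete distribution {value: probability}."""
    return sum(p * cmath.exp(1j * z * x) for x, p in pmf.items())


# Fair die: values 1..6, each with probability 1/6.
die = {k: 1.0 / 6.0 for k in range(1, 7)}

# Distribution of x + y for two independent dice, built from P(x, y) = P(x) P(y).
sum_pmf = {}
for (x, px), (y, py) in product(die.items(), die.items()):
    sum_pmf[x + y] = sum_pmf.get(x + y, 0.0) + px * py

z = 0.7
lhs = char_fn(sum_pmf, z)      # E{exp(i z (x + y))}
rhs = char_fn(die, z) ** 2     # phi_x(z) * phi_y(z), with phi_x = phi_y
print(abs(lhs - rhs))
```

Because the two dice are identically distributed, their characteristic functions coincide, so the right-hand side is simply a square.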
6.5.4 Central Limit Theorem


We are now ready to state the key theorem. 


Central limit theorem:

Consider a sequence of random variables $\{x_n\}$ in which the $x_n$ are independent
and identically distributed. For arbitrary $a$ and $b$ ($b > a$), we have

$$\lim_{n \to \infty} P\left(a < \frac{\sum_{j=1}^{n} x_j - nm}{\sigma\sqrt{n}} < b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy, \tag{6.47}$$

where $m = E\{x_n\}$ and $\sigma^2 = V\{x_n\}$.


Briefly, the theorem states that, for large $n$, the probability density for the
normalized average of $n$ random variables to take the value $\alpha$ is proportional
to $e^{-\alpha^2/2}$. (Note that $\alpha$ is built from the average of $n$ variables and is
not one of the variables itself.) A random variable with a probability
density proportional to $e^{-x^2/2}$ is said to be normally distributed.

Remark. The central limit theorem is very effective in describing various
stochastic phenomena in nature since it can be applied regardless of the
distribution of the $n$ random variables; i.e., almost all classes of random variables
with finite mean and variance obey the theorem as long as they are independent
and identically distributed.

An illustrative example of the central limit theorem in physics is the
Maxwell-Boltzmann distribution of an ideal gas. For a given temperature
$T$, the distribution $f(v)$ of the velocity of gas molecules, with $v = |\mathbf{v}|$, is
known to satisfy the equation

$$f(v) = \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right), \tag{6.48}$$

where $m$ is the mass of a gas molecule and $k_B$ is the Boltzmann constant. Here,
the velocity $v(t_i)$ as a function of discrete time $t_i$ ($i = 1, 2, \cdots, n$) serves as
$n$ random variables. In general, in an equilibrium state, the $v(t_i)$ for different $t_i$
are independent and identically distributed and, thus, if $n$ is sufficiently large,
the time average of $v(t_i)$ obeys the normal distribution described by (6.48).
Figure 6.7 shows the distribution of the squared velocity of gas molecules,
which is determined from the formula $4\pi v^2 f(v)$, for various values of $T$; we
set $k_B = 1.38 \times 10^{-23}\ \mathrm{kg\,m^2/(s^2\,K)}$ and $m = 6.6 \times 10^{-27}$ kg by considering
$^4$He atoms. We observe that the mean value of $v^2$ shifts to the right with
an increase in the temperature, which can be intuitively understood as the
molecules moving faster at high temperatures.

It is important to emphasize that the central limit theorem holds for
any kind of distribution of the $n$ variables $\{x_i\}$ as long as they are independent
and identically distributed. For example, let us consider $n$ variables that obey
the distribution $P(x)$ shown at the top of Fig. 6.8. The averages of these variables
follow the distributions depicted at the bottom of Fig. 6.8, which converge to the
normal distribution as $n$ increases. The fact that the distribution of $\{x_i\}$ can be
disregarded is the reason the normal distribution is so universally observed in
a wide variety of stochastic phenomena.
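A small simulation makes the statement (6.47) concrete. The sketch below assumes variables uniformly distributed on $[0, 1)$, for which $m = 1/2$ and $\sigma^2 = 1/12$; the sample sizes, the seed, and the interval $(a, b) = (-1, 1)$ are arbitrary choices made here and are not taken from the text or from Fig. 6.8.

```python
import math
import random


def normalized_sum(n, rng):
    """(sum_j x_j - n m) / (sigma sqrt(n)) for x_j uniform on [0, 1)."""
    m, sigma = 0.5, math.sqrt(1.0 / 12.0)
    s = sum(rng.random() for _ in range(n))
    return (s - n * m) / (sigma * math.sqrt(n))


def normal_prob(a, b):
    """(1/sqrt(2 pi)) * integral from a to b of exp(-y^2/2) dy, via erf."""
    return 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))


rng = random.Random(0)                       # fixed seed for reproducibility
samples = [normalized_sum(30, rng) for _ in range(20000)]

a, b = -1.0, 1.0
empirical = sum(a < y < b for y in samples) / len(samples)
exact_limit = normal_prob(a, b)              # right-hand side of (6.47)
print(empirical, exact_limit)
```

The empirical frequency fluctuates from seed to seed, so only agreement up to sampling error should be expected.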


6.5.5 Proof of the Central Limit Theorem 


As some further points have to be discussed in order to prove the central 
limit theorem, we present below only an outline and not a rigorous proof. 
Let us emphasize that the use of Lebesgue integrals is necessary for proving 
the central limit theorem, and the Lebesgue convergence theorem and 
Fubini’s theorem are used time and again. 


Proof We have only to consider the case of $m = 0$ and $\sigma = 1$; otherwise,
the new variable $\tilde{x}_n = (x_n - m)/\sigma$ is introduced to yield $E\{\tilde{x}_n\} = 0$ and
$V\{\tilde{x}_n\} = 1$. The characteristic function $\varphi_{y_n}(z)$ for the variable

$$y_n = \frac{\sum_{j=1}^{n} x_j}{\sqrt{n}}$$

is given by

$$\varphi_{y_n}(z) = E\left\{\exp\left(\frac{iz}{\sqrt{n}} \sum_{j=1}^{n} x_j\right)\right\} = \prod_{j=1}^{n} \varphi_{x_j}\left(\frac{z}{\sqrt{n}}\right),$$

in which the condition that all $x_n$ are independent allows us to obtain the last
expression. Furthermore, since all $x_n$ are identically distributed, we have

$$\varphi_{y_n}(z) = \left[\varphi_{x_1}\left(\frac{z}{\sqrt{n}}\right)\right]^n,$$

which is ensured by the uniqueness theorem discussed in Sect. 6.5.3.

Fig. 6.8. Top: Distributions of a random variable $x$. Bottom: (a)-(c) Distributions
of the average value $a$ of $n$ random variables $x_1, x_2, \cdots, x_n$ with $n = 10$ for (a),
$n = 100$ for (b), and $n = 1000$ for (c). For each, 1000 $a$’s are sampled to create
the distribution. With increasing $n$, the distribution of $a$ converges to the normal
distribution around the center of 0.125 as expected

We want the limit of $\varphi_{y_n}(z)$ as $n \to \infty$, so we use the formula (see the
lemma below)

$$\lim_{n \to \infty} \varphi_{y_n}(z) = e^{-z^2/2}, \tag{6.49}$$

in which the right-hand side is the characteristic function of a normal
distribution. The result (6.49) together with the continuity theorem (see
below) states that

$$\lim_{n \to \infty} P(a < y_n < b) = P(a < y < b) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy. \tag{6.50}$$

∎

The following theorem forms the basis for the proof of the central limit 
theorem. 


Continuity theorem:

Let $x$ and $x_n$ be random variables such that

$$\lim_{n \to \infty} \varphi_{x_n}(z) = \varphi_x(z).$$

We then obtain

$$\lim_{n \to \infty} P(a < x_n < b) = P(a < x < b)$$

for arbitrary $a, b$ ($b > a$) satisfying $P(x = a) = P(x = b) = 0$.


This theorem states that the convergence of characteristic functions implies 
the convergence of the corresponding distribution functions. Since the proof 
requires the use of Fubini’s theorem as well as the Lebesgue convergence 
theorem and is quite complicated, we do not present it. 


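Lemma:

If $E\{x\} = 0$ and $E\{x^2\} = 1$ for a random variable $x$, then the characteristic
function $\varphi_x(z)$ satisfies the relation

$$\lim_{n \to \infty} \left[\varphi_x\left(\frac{z}{\sqrt{n}}\right)\right]^n = e^{-z^2/2}. \tag{6.51}$$

Proof The assumption that $E\{x^2\} = 1 < \infty$ implies that $\varphi_x(z)$ is twice
differentiable. In fact, we obtain

$$\begin{aligned}
\varphi_x(z) &= E\{e^{izx}\}, \\
\varphi_x'(z) &= E\left\{\frac{d}{dz} e^{izx}\right\} = E\{ix\,e^{izx}\}, \\
\varphi_x''(z) &= E\left\{\frac{d^2}{dz^2} e^{izx}\right\} = E\{-x^2 e^{izx}\},
\end{aligned}$$

where the Lebesgue convergence theorem was used to interchange the
order of differentiation $d/dz$ and integration $\int dx$ associated with the calculation
of $E\{\cdots\}$. The twice differentiability of $\varphi_x$ allows us to expand it around
$z = 0$ as

$$\varphi_x\left(\frac{z}{\sqrt{n}}\right) = \varphi_x(0) + \frac{z}{\sqrt{n}}\,\varphi_x'(0) + \frac{z^2}{2n}\,\varphi_x''(\eta),$$

where $\eta$ satisfies $|\eta| < |z|/\sqrt{n}$. Since $\varphi_x(0) = 1$ and $\varphi_x'(0) =
E\{ix\} = 0$, we have

$$\log\left[\varphi_x\left(\frac{z}{\sqrt{n}}\right)\right]^n = n \log \varphi_x\left(\frac{z}{\sqrt{n}}\right) = n \log\left(1 + \frac{z^2}{2n}\,\varphi_x''(\eta)\right) \simeq \frac{z^2}{2}\,\varphi_x''(\eta) \quad (n \gg 1),$$

where we used the inequality $|\varphi_x''(\eta)| \le 1$ to expand the logarithmic term for
$n \gg 1$. Since $\eta \to 0$ as $n \to \infty$ and $\varphi_x''(0) = -E\{x^2\} = -1$, we obtain

$$\lim_{n \to \infty} \log\left[\varphi_x\left(\frac{z}{\sqrt{n}}\right)\right]^n = -\frac{z^2}{2},$$

which is equivalent to (6.51). ∎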
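The limit (6.51) can be observed numerically for a concrete variable. As an assumption made for this sketch, take $x = \pm 1$ with probability $1/2$ each, so that $E\{x\} = 0$, $E\{x^2\} = 1$, and $\varphi_x(z) = \cos z$; this example is not taken from the text.

```python
import math


def phi(z):
    """Characteristic function of x = +1 or -1 with probability 1/2 each."""
    return math.cos(z)


def phi_power(z, n):
    """[phi_x(z / sqrt(n))]^n, the left-hand side of (6.51) before the limit."""
    return phi(z / math.sqrt(n)) ** n


z = 1.5
target = math.exp(-z * z / 2.0)
errors = [abs(phi_power(z, n) - target) for n in (10, 100, 1000, 10000)]
print(errors)
```

The deviation from $e^{-z^2/2}$ shrinks roughly in proportion to $1/n$, in line with the next term of the expansion of the logarithm.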
Complex Analysis



7 


Complex Functions 


Abstract Differentiation and integration of complex functions are significantly dif- 
ferent from those of real functions. In this chapter, we show that two very impor- 
tant theorems — the Cauchy theorem (Sect. 7.2.2) and the Taylor series expansion 
(Sect. 7.4.3) — result in a broad range of mathematical consequences that are highly 
relevant and useful in mathematical physics. However, before moving on to the 
principal discussion, we deal with the underlying concepts of analytic functions 
(Sect. 7.1.2) and the geometric meaning of analyticity (Sect. 7.1.5). 


7.1 Analytic Functions 

7.1.1 Continuity and Differentiability 

This chapter describes the theory of functions of a complex variable. Let $\mathbb{C}$
denote the set of all elements $z$ of the form

$$z = x + iy,$$

where $x, y \in \mathbb{R}$ and $i$ is the familiar symbol defined by $i^2 = -1$. Let $D$ be a
domain in $\mathbb{C}$. Then, a complex function defined by

$$f : D \to \mathbb{C}$$

is a rule that assigns a complex number $f(z)$ to each $z \in D$. This $f(z)$
is equivalent to an ordered pair of real-valued functions $u(z)$ and $v(z)$. Thus,
$f(z)$ can be written in the form

$$w = f(z) = u(z) + iv(z).$$

The real-valued functions $u(z)$ and $v(z)$ are called the real and imaginary
parts (or components) of $f(z)$ (see Fig. 7.1). We may write $u = \mathrm{Re}\, f$ and
$v = \mathrm{Im}\, f$.

Fig. 7.1. A complex function $w = f(z)$ that assigns a point on the $w$-plane to each
point on the $z$-plane


Once we introduce complex functions, the concepts of differentiation and
integration encountered in ordinary real calculus acquire new depth and
significance. When $f(z)$ has a derivative throughout $D$, it is referred to as an analytic
function in $D$. (More precise definitions of analytic functions are given in
Sect. 7.1.2.) We shall see that the condition for a complex-valued function
$f(z)$ to be differentiable with respect to a complex variable $z$ is much stronger
than that for a real-valued function $f(x)$ with respect to a real variable $x$.
This restriction imposes a great deal of structure on $f(z)$.

An exact definition of an analytic function is obtained by considering its
derivative with respect to a complex variable $z$. Therefore, our first task is to
determine the necessary and sufficient conditions for a complex function $f(z)$
to have a derivative with respect to $z$. Before stating what is meant by the
derivative, we begin with the definition of continuity for $f(z)$.

Continuity of complex functions:

Let $f : D \to \mathbb{C}$ be a complex function and $z_0$ a point in $D$. Then, the
function $w = f(z) \in \mathbb{C}$ is continuous at the point $z_0$ if

$$\lim_{z \to z_0} f(z) = f(z_0). \tag{7.1}$$


In the limit of (7.1), the complex variable $z$ may approach $z_0$ from any
direction in $D$ (see Fig. 7.2). Hence, if we say the limit (7.1) exists, it means that a
unique quantity $f(z_0)$ must result from the limiting process regardless of how
the limit $z \to z_0$ is taken.

A similar feature is found in the definition of the derivative of f(z). 

Derivatives of complex functions:

A complex function $f(z)$ is said to be differentiable at the point $z_0$ if
and only if the limit

$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}, \quad z \in D \tag{7.2}$$

exists and is uniquely determined regardless of the manner in which $z$
approaches $z_0$. When the limit exists, we denote it by $f'(z_0)$, the derivative
of $f(z)$ at $z_0$.


[Fig. 7.2. The $z$-plane (vertical axis $y = \mathrm{Im}\, z$)]


The definition (7.2) requires that the ratio $[f(z_0 + \Delta z) - f(z_0)]/\Delta z$ always
tend to a unique limiting value, no matter the path along which $z$ approaches
$z_0$. This is an extremely strict condition; in fact, a number of theorems in the
theory of analytic functions are derived from this requirement.

Keep in mind that a function $f(z)$ may be differentiable only at a point,
or on a curve, or throughout a region. An example of a function that is differentiable
at a single point only is presented in Example 3 in Sect. 7.1.2.

7.1.2 Definition of an Analytic Function 

Among many differentiable functions, some specific kinds of functions form 
the class of analytic functions as stated below. 

Analytic functions:

A function $f(z)$ is said to be analytic at the point $z = z_0$ if and only if
it is differentiable throughout a neighborhood of $z = z_0$.

Remark. There are some synonyms for the term analytic: holomorphic,
regular, and regular analytic.


We offer some comments on the distinction between differentiability and
analyticity. As noted above, the conditions for $f(z)$ to be analytic are more
stringent than those for it to be differentiable; in fact, a function $f(z)$ is said
to be analytic at a point $z_0$ if it has a derivative at $z_0$ and at all points in
some neighborhood of $z_0$. In this context, if we say that a function is analytic on
a curve, we mean that it has a derivative at all points on a two-dimensional
narrow strip containing the curve. If a function is differentiable only at a
point or only along a curve, then it is not analytic there, and we say it
is singular there. A typical example of an $f(z)$ that is differentiable only at a
point is demonstrated in Example 3 below.


Examples 1. The function $f(z) = z^n$ is differentiable and analytic everywhere.
In fact, the limit

$$\begin{aligned}
\lim_{\Delta z \to 0} \frac{(z_0 + \Delta z)^n - z_0^n}{\Delta z}
&= \lim_{\Delta z \to 0} \left[ n z_0^{n-1} + \frac{n(n-1)}{2} z_0^{n-2} \Delta z + \cdots + (\Delta z)^{n-1} \right] \\
&= n z_0^{n-1}
\end{aligned}$$

exists for arbitrary $z_0$, and is clearly independent of the path along which
$\Delta z \to 0$. This means that any polynomial in $z$ is differentiable and
analytic everywhere.


2. The function $f(z) = z^*$ is neither differentiable nor analytic anywhere,
since the limit yields

$$\lim_{\Delta z \to 0} \frac{(z_0 + \Delta z)^* - z_0^*}{\Delta z} = \lim_{\Delta z \to 0} \frac{\Delta z^*}{\Delta z}. \tag{7.3}$$

If $\Delta z \to 0$ parallel to the real axis, then $\Delta z = \Delta z^* = \Delta x$ so that the limit
equals 1. However, if $\Delta z \to 0$ parallel to the imaginary axis, then $\Delta z =
i\Delta y = -\Delta z^*$ so that the limit equals $-1$. Therefore, the quantity (7.3)
depends on the path $\Delta z \to 0$, which means that $f(z) = z^*$ is neither differentiable
nor analytic anywhere.

3. The function $f(z) = |z|^2$ is differentiable only at the origin. In fact,

$$\begin{aligned}
f(z_0 + \Delta z) - f(z_0) &= (z_0 + \Delta z)(z_0^* + \Delta z^*) - z_0 z_0^* \\
&= z_0 \Delta z^* + z_0^* \Delta z + \Delta z \Delta z^*,
\end{aligned}$$

which yields

$$\lim_{\Delta z \to 0} \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}
= \lim_{\Delta z \to 0} \frac{z_0 \Delta z^* + z_0^* \Delta z + \Delta z \Delta z^*}{\Delta z}
= c z_0 + z_0^*,$$

where $c = \lim_{\Delta z \to 0} (\Delta z^* / \Delta z)$ is a complex-valued quantity that depends
on the path of $\Delta z \to 0$. Hence, the limit noted above is uniquely determined
only when $z_0 = 0$, which means that the function $f(z) = |z|^2$ is
differentiable only at the point $z = 0$.
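The path dependence discussed in Examples 1 to 3 can be probed numerically by evaluating the difference quotient along several directions of $\Delta z$. The sketch below is only an illustration: the step size, the number of directions, the test point, and the helper names `quotient` and `spread` are choices made here, and $z^3$ stands in for the general power $z^n$.

```python
import cmath


def quotient(f, z0, direction, eps=1e-6):
    """[f(z0 + dz) - f(z0)] / dz with dz = eps * exp(i * direction)."""
    dz = eps * cmath.exp(1j * direction)
    return (f(z0 + dz) - f(z0)) / dz


def spread(f, z0, n_dirs=8):
    """Largest difference between quotients taken along n_dirs directions."""
    qs = [quotient(f, z0, 2 * cmath.pi * k / n_dirs) for k in range(n_dirs)]
    return max(abs(q - qs[0]) for q in qs)


def cube(z):    # Example 1 with n = 3
    return z ** 3


def conj(z):    # Example 2: f(z) = z*
    return z.conjugate()


def abs2(z):    # Example 3: f(z) = |z|^2
    return abs(z) ** 2


z0 = 1.0 + 2.0j
print(spread(cube, z0), spread(conj, z0), spread(abs2, z0), spread(abs2, 0j))
```

A spread close to zero indicates a direction-independent quotient; for $z^*$ the quotient swings between $1$ and $-1$, and for $|z|^2$ the spread vanishes only at the origin.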


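7.1.3 Cauchy-Riemann Equations


Let $f : D \to \mathbb{C}$ with $f(z) = u(z) + iv(z)$ as usual. We now examine the
conditions under which a function $f(z) = u(x, y) + iv(x, y)$ is differentiable
at a point $z_0 \in D$. Let us assume that $f(z)$ is differentiable at $z_0 \in D$. Then
we have

$$f'(z_0) = \lim_{\Delta z \to 0} \frac{\Delta f}{\Delta z} = \lim_{\Delta z \to 0} \left( \frac{\Delta u}{\Delta z} + i \frac{\Delta v}{\Delta z} \right).$$

Since $f'(z_0)$ exists, it is independent of the path $\Delta z \to 0$; i.e., it is independent
of the ratio $\Delta y / \Delta x$. If the limit is taken parallel to the real axis, so that $\Delta y = 0$ and
$\Delta z = \Delta x$, we have

$$f'(z_0) = \lim_{\Delta x \to 0} \left( \frac{\Delta u}{\Delta x} + i \frac{\Delta v}{\Delta x} \right) = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x}.$$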

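On the other hand, if the limit approaches the point $z_0$ along the line parallel
to the imaginary axis, so that $\Delta x = 0$ and $\Delta z = i\Delta y$, then

$$f'(z_0) = \lim_{\Delta y \to 0} \left( \frac{\Delta v}{\Delta y} - i \frac{\Delta u}{\Delta y} \right) = \frac{\partial v}{\partial y} - i \frac{\partial u}{\partial y}.$$

From the initial assumption, these two limits must be equal, so equating real
and imaginary parts gives us

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{7.4}$$

Equations (7.4) are known as the Cauchy-Riemann relations (abbreviated
as CR relations), and they are a necessary condition for differentiability.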

However, taken alone they are not sufficient. This is because they were
derived from only two special cases of the requirement of differentiability,
as demonstrated above. In fact, sufficient conditions for the differentiability
of $f(z)$ at $z_0$ consist of the following two statements:


Theorem:

A function $f(z)$ is differentiable at $z_0 \in D$ if

(i) the first-order partial derivatives of $u(x, y)$ and $v(x, y)$ exist and are
continuous at $z_0$, and

(ii) those derivatives at $z_0$ satisfy the CR equations.

Proof We prove that conditions (i) and (ii) imply the differentiability of $f(z)$
at $z_0 \in D$. (That differentiability in turn implies the CR equations was shown at
the beginning of this subsection.) From hypothesis (i), the functions $u$, $\partial u/\partial x$,
and $\partial u/\partial y$ are all continuous at the point $z_0 = x_0 + iy_0$, so we have

$$\begin{aligned}
\Delta u &= u(x_0 + \Delta x, y_0 + \Delta y) - u(x_0, y_0) \\
&= \frac{\partial u}{\partial x} \Delta x + \frac{\partial u}{\partial y} \Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y,
\end{aligned} \tag{7.5}$$

to first order in $\Delta x$ and $\Delta y$. In (7.5), the partial derivatives
are evaluated at the point $(x_0, y_0)$, and the real numbers $\varepsilon_1$ and $\varepsilon_2$ vanish as
$\Delta x, \Delta y \to 0$. Using a similar formula for $v(x, y)$, we have


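$$\begin{aligned}
\Delta f &= f(z_0 + \Delta z) - f(z_0) = \Delta u + i \Delta v \\
&= \frac{\partial u}{\partial x} \Delta x + \frac{\partial u}{\partial y} \Delta y + \varepsilon_1 \Delta x + \varepsilon_2 \Delta y
+ i \left[ \frac{\partial v}{\partial x} \Delta x + \frac{\partial v}{\partial y} \Delta y + \varepsilon_3 \Delta x + \varepsilon_4 \Delta y \right].
\end{aligned}$$

Using the CR equations, which are supposed to hold at the point $(x_0, y_0)$ by
assumption (ii) above, gives us

$$\Delta f = \left( \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} \right) (\Delta x + i \Delta y) + \Delta x (\varepsilon_1 + i \varepsilon_3) + \Delta y (\varepsilon_2 + i \varepsilon_4).$$

Dividing both sides by $\Delta z = \Delta x + i \Delta y$ yields

$$\frac{\Delta f}{\Delta z} = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} + (\varepsilon_1 + i \varepsilon_3) \frac{\Delta x}{\Delta z} + (\varepsilon_2 + i \varepsilon_4) \frac{\Delta y}{\Delta z}. \tag{7.6}$$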


Since $|\Delta z| = \sqrt{(\Delta x)^2 + (\Delta y)^2}$, we have

$$|\Delta x| \le |\Delta z| \quad \text{and} \quad |\Delta y| \le |\Delta z|,$$

so that

$$\left| \frac{\Delta x}{\Delta z} \right| \le 1 \quad \text{and} \quad \left| \frac{\Delta y}{\Delta z} \right| \le 1. \tag{7.7}$$

Hence, it follows from (7.7) that the last two terms in (7.6) tend to zero with
$\Delta z \to 0$ because $\lim_{\Delta z \to 0} \varepsilon_n = 0$ ($1 \le n \le 4$). As a result, the limit

$$\lim_{\Delta z \to 0} \frac{\Delta f}{\Delta z} = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} \tag{7.8}$$

is independent of the path of $\Delta z \to 0$, so the derivative $f'(z_0)$ exists. We
thus have verified that $f(z)$ is differentiable at $z_0$ if conditions (i) and (ii) are
satisfied. This completes the proof. ∎

Examples 1. Regarding the function

$$f(z) = z^2 = (x^2 - y^2) + i(2xy) = u + iv, \tag{7.9}$$

we have

$$\frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial v}{\partial x} = 2y = -\frac{\partial u}{\partial y}. \tag{7.10}$$

These equations mean that everywhere in the complex plane the CR relations
hold and the partial derivatives are continuous. Hence, the function
(7.9) is analytic in the entire complex plane. Such analytic functions are
called entire functions.

2. We saw in Sect. 7.1.2 that the function $f(z) = |z|^2 = x^2 + y^2$ is not
analytic anywhere since it is differentiable only at the origin. In fact, it
yields

$$\frac{\partial u}{\partial x} = 2x, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} = 0,$$

which satisfy the CR relations only at the origin.
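The two examples above can be cross-checked with finite differences. The following sketch estimates the partial derivatives of $u$ and $v$ by central differences and reports how badly the CR relations (7.4) are violated; the step size, the test point, and the helper names are assumptions made for illustration.

```python
def partials(g, x, y, h=1e-5):
    """Central-difference estimates of (dg/dx, dg/dy) at (x, y)."""
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy


def cr_residual(f, x, y):
    """max(|u_x - v_y|, |u_y + v_x|) for f(x + iy) = u + iv."""
    def u(a, b):
        return f(complex(a, b)).real

    def v(a, b):
        return f(complex(a, b)).imag

    ux, uy = partials(u, x, y)
    vx, vy = partials(v, x, y)
    return max(abs(ux - vy), abs(uy + vx))


def square(z):       # f(z) = z^2, Example 1
    return z * z


def abs_square(z):   # f(z) = |z|^2, Example 2
    return complex(abs(z) ** 2, 0.0)


print(cr_residual(square, 0.3, -1.2))       # CR relations hold
print(cr_residual(abs_square, 0.3, -1.2))   # CR relations violated
print(cr_residual(abs_square, 0.0, 0.0))    # CR relations hold at the origin
```

A vanishing residual at a single point, as for $|z|^2$ at the origin, is consistent with differentiability there but says nothing about analyticity, which requires a whole neighborhood.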


7.1.4 Harmonic Functions 


The CR relations immediately provide one remarkable result that points to
connections with physics. Provided that the CR relations hold in a region, we
obtain

$$\frac{\partial}{\partial x} \frac{\partial u}{\partial x} = \frac{\partial}{\partial x} \frac{\partial v}{\partial y} = \frac{\partial}{\partial y} \frac{\partial v}{\partial x} = -\frac{\partial}{\partial y} \frac{\partial u}{\partial y}. \tag{7.11}$$

Here we assume the continuity of the second-order partial derivatives of $u(x, y)$
and $v(x, y)$, which allows us to interchange the orders of differentiation in
the mixed partial derivatives in (7.11). (This qualification, however, can be
dropped since the second-order partial derivatives of an analytic function are
necessarily continuous, as we prove later.) Equation (7.11) yields the Laplace
equation:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \nabla^2 u = 0.$$

In the same way, it follows that

$$\nabla^2 v = 0.$$

Thus we arrive at the following theorem:

Theorem:

Each of the real and imaginary parts of an analytic function satisfies the
two-dimensional Laplace equation.

Any function $\phi$ satisfying $\nabla^2 \phi = 0$ is called a harmonic function. Accordingly,
if $f = u + iv$ is an analytic function, then $u$ and $v$ are called conjugate
harmonic functions since $\nabla^2 u = \nabla^2 v = 0$ holds. The fact that the real and
imaginary components of analytic functions satisfy the Laplace equation plays
a crucial role in solving applied second-order partial differential equations.
Detailed discussions on this point are presented in Sect. 9.4.3.
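The theorem can be illustrated with a five-point finite-difference Laplacian. The sketch below assumes the analytic function $f(z) = z^3$ as a test case and, for contrast, the non-harmonic function $x^2 + y^2$; the step size and evaluation point are arbitrary choices made here.

```python
def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference estimate of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4.0 * g(x, y)) / h ** 2


def f(z):
    return z ** 3


def u(x, y):   # real part of z^3, i.e. x^3 - 3 x y^2
    return f(complex(x, y)).real


def v(x, y):   # imaginary part of z^3, i.e. 3 x^2 y - y^3
    return f(complex(x, y)).imag


def w(x, y):   # x^2 + y^2 is not harmonic: its Laplacian equals 4
    return x ** 2 + y ** 2


print(laplacian(u, 0.7, -0.4), laplacian(v, 0.7, -0.4), laplacian(w, 0.7, -0.4))
```

Note that $x^2 + y^2 = |z|^2$ is the real part of a function that is nowhere analytic, so its failure to be harmonic does not contradict the theorem.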


7.1.5 Geometric Interpretation of Analyticity 

To gain in-depth insight into the nature of analytic functions, we reveal the
geometric meaning of “analyticity.” We know that the analyticity of $f(z)$
within a domain $D$ ensures the existence of the derivative $f'(z) = df/dz$
defined by

$$f'(z) = \lim_{h \to 0} \frac{f(z + h) - f(z)}{h}.$$

This suggests that at a point $z_0$ within $D$,

$$f(z_0 + h) - f(z_0) \simeq f'(z_0) h \tag{7.12}$$

for an arbitrary complex number $h$ whose magnitude $|h|$ is sufficiently small.

Let us consider the geometrical meaning of (7.12). For the discussion to
be concrete, we assume, for the moment, that the derivative $f'(z)$ takes the
values

$$f'(z_0) = 1 + i \quad \text{and} \quad f'(z_1) = \frac{-1 + \sqrt{3}\,i}{2}$$

at the points $z_0$ and $z_1$ in $D$. It then follows that

$$f'(z_0) h = (1 + i) h = \sqrt{2} \left( \frac{1}{\sqrt{2}} + \frac{i}{\sqrt{2}} \right) h = \sqrt{2}\, e^{i\pi/4} h, \tag{7.13}$$

where $h = |h| (\cos\theta + i \sin\theta)$ is a complex number having a certain argument
$\theta$. Equation (7.13) means that $f'(z_0) h$ is obtained through the rotation of the
vector $h$ by $\pi/4$ followed by multiplication by $\sqrt{2}$. (Note that any complex
number can be regarded as a vector on the two-dimensional complex plane.)
Similarly, we have

$$f'(z_1) h = e^{2\pi i/3} h, \tag{7.14}$$

which states that $f'(z_1) h$ is obtained through the rotation of $h$ by $2\pi/3$. The
processes are schematically illustrated in Fig. 7.3. The vector $h$ is depicted by
thin arrows and the corresponding vectors $f'(z) h$ by thick arrows. It is noteworthy
that the magnitude $|f'(z) h|$ at both $z_0$ and $z_1$ is invariant no matter what
direction the vector $h$ takes; indeed, it follows from (7.13) and (7.14) that

$$|f'(z_0) h| = \sqrt{2}\, |h| \quad \text{and} \quad |f'(z_1) h| = |h|.$$

Hence, when the direction of $h$ is shifted by increasing $\theta$, $|f'(z) h|$ remains
unchanged so that the front edge of the vector $f'(z) h$ moves along a circle
centered at the origin.

Fig. 7.3. Illustration of analyticity of $f(z)$ at $z_0$. An infinitesimal circle on the
$z$-plane centered at an analytic point is mapped to a circle on the $w$-plane with slight
modulation


Now we go back to (7.12), which says that if $f(z)$ is analytic at $z_0$, the
vectors $f'(z_0) h$ obtained above are almost equal to the vectors $f(z_0 +
h) - f(z_0)$. This implies that the magnitude $|f(z_0 + h) - f(z_0)|$ is almost
invariant under a change in the direction of $h$ characterized by $\theta$. Thus as $\theta$
increases, the front edge of $f(z_0 + h) - f(z_0)$ should trace a circle centered at
the origin. (To be precise, the radius may be subject to a slight fluctuation,
as shown in Fig. 7.3, owing to contributions from terms of order $h^2$ and higher.)
In other words, since $f(z_0)$ is fixed, an increase in $\theta$ from 0 to $2\pi$ results in
movement of $f(z_0 + h)$ along the circle centered at $f(z_0)$. This means that for
analytic functions $f(z)$, the change in the magnitude of $f$ for an infinitesimal
change in $z$ is isotropic. This isotropy is the geometric interpretation of the
analyticity of $f(z)$.

Better understanding can be attained by considering the case of nonanalytic
functions. Let us use the same argument for the function

$$f(z) = x^2 + iy, \tag{7.15}$$

where $u = x^2$ and $v = y$. This function is not analytic, since it does not satisfy
the CR relations. Indeed,

$$\frac{\partial u}{\partial x} = 2x \ne 1 = \frac{\partial v}{\partial y}$$

except at $x = 1/2$. For such a nonanalytic function, the isotropy regarding the
magnitude of the difference $|f(z + h) - f(z)|$ for infinitesimal $h$ breaks down,
as is shown below. Once we set $h = |h| (\cos\theta + i \sin\theta)$ with $|h| = \varepsilon = \text{const}$,
we have

$$\begin{aligned}
f(z_0 + h) &= (x_0 + \varepsilon \cos\theta)^2 + i (y_0 + \varepsilon \sin\theta) \\
&\simeq x_0^2 + 2\varepsilon \cos\theta \cdot x_0 + i y_0 + i \varepsilon \sin\theta \\
&= f(z_0) + 2\varepsilon \cos\theta \cdot x_0 + i \varepsilon \sin\theta,
\end{aligned} \tag{7.16}$$

up to the order of $\varepsilon$. Equation (7.16) indicates that when $\theta$ increases, the front
edge of the vector $f(z_0 + h)$ moves along an ellipse with semi-axes $2|x_0|\varepsilon$
and $\varepsilon$ (see Fig. 7.4). That is, the magnitude $|f(z_0 + h) - f(z_0)|$ is
no longer isotropic, but depends on the direction of $h$ (except for the particular
case of $x_0 = 1/2$).

Fig. 7.4. Schematic illustration of nonanalyticity. When $f(z)$ is not analytic at
$z = z_0$, then an infinitesimal circle centered at $z_0$ is mapped to an ellipse so the
isotropy breaks down
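The isotropy argument can be tested by mapping a small circle around $z_0$ and comparing the largest and smallest distances $|f(z_0 + h) - f(z_0)|$. In the sketch below, $f(z) = z^2$ is an assumed example of an analytic function, while the nonanalytic function is the one of (7.15); the radius $\varepsilon$, the number of sample angles, and the test points are choices made for illustration.

```python
import cmath
import math


def radius_ratio(f, z0, eps=1e-4, n=360):
    """max/min of |f(z0 + h) - f(z0)| over |h| = eps, sampled at n angles."""
    radii = []
    for k in range(n):
        h = eps * cmath.exp(2j * math.pi * k / n)
        radii.append(abs(f(z0 + h) - f(z0)))
    return max(radii) / min(radii)


def analytic(z):       # f(z) = z^2
    return z * z


def nonanalytic(z):    # f(z) = x^2 + i y, Eq. (7.15)
    return complex(z.real ** 2, z.imag)


z0 = 1.0 + 0.5j
print(radius_ratio(analytic, z0))                  # close to 1: circle
print(radius_ratio(nonanalytic, z0))               # close to 2 x0 = 2: ellipse
print(radius_ratio(nonanalytic, 0.5 + 0.5j))       # close to 1 at x0 = 1/2
```

For the nonanalytic function the ratio approaches the ratio of the semi-axes, $2|x_0|\varepsilon$ to $\varepsilon$, and it returns to 1 only in the particular case $x_0 = 1/2$.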

Exercises 

1. Show that $f(z)$ is continuous at $z_0$ if it is analytic there.

Solution: With the definition $\Delta z = z - z_0$, we have the identity

$$f(z) - f(z_0) = f(z_0 + \Delta z) - f(z_0) = \Delta z \cdot \frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z},$$

from which we obtain

$$\lim_{\Delta z \to 0} [f(z_0 + \Delta z) - f(z_0)] = \left( \lim_{\Delta z \to 0} \Delta z \right) f'(z_0) = 0.$$

Moreover, if we write $f(z) = u(z) + iv(z)$, it follows that $u(z)$ and
$v(z)$ are both continuous. ∎

2. Express the Cauchy-Riemann relations in polar coordinates $(r, \theta)$.

Solution: By setting $z = x + iy = re^{i\theta}$, we transform the
partial derivative with respect to $x$ into $\partial/\partial x = (\partial r/\partial x)(\partial/\partial r) +
(\partial\theta/\partial x)(\partial/\partial\theta)$. After some algebra, we obtain $\partial/\partial x = \cos\theta\,(\partial/\partial r) -
(\sin\theta/r)(\partial/\partial\theta)$, which, together with the same procedure with
respect to $\partial/\partial y$, yields the polar form of the CR relations as

$$\frac{\partial u}{\partial r} = \frac{1}{r} \frac{\partial v}{\partial \theta}, \quad \frac{1}{r} \frac{\partial u}{\partial \theta} = -\frac{\partial v}{\partial r}.$$

Their abbreviated forms read $u_r = v_\theta / r$ and $u_\theta = -r v_r$. ∎

3. If $f(z)$ is analytic in a region $D$ and if $|f(z)|$ is constant there, then $f(z)$
is constant. Prove it.

Solution: If $|f| = 0$, the proof is immediate. Otherwise we have

$$u^2 + v^2 = c \ne 0. \tag{7.17}$$

Taking the partial derivatives with respect to $x$ and $y$, we have
$u u_x + v v_x = 0$ and $u u_y + v v_y = 0$. Using the CR relations, we
obtain $u u_x - v u_y = 0$ and $v u_x + u u_y = 0$, so that

$$(u^2 + v^2) u_x = 0. \tag{7.18}$$

From (7.17) and (7.18), and from the CR relations, we conclude
that $u_x = v_y = 0$. We can obtain $u_y = v_x = 0$ in a similar manner.
Therefore, $f$ is constant. ∎

4. Let $\phi(x, y)$ and $\psi(x, y)$ be harmonic functions in a domain $D$. Show that
if we set $u = \phi_y - \psi_x$ and $v = \phi_x + \psi_y$, the function $f(z) = u + iv$ with
the variable $z = x + iy$ becomes analytic in $D$.

Solution: It follows that $u_x - v_y = (\phi_{yx} - \psi_{xx}) - (\phi_{xy} + \psi_{yy}) =
-\nabla^2 \psi$, where $\phi_{yx} = \phi_{xy}$ was used. Since $\nabla^2 \psi = 0$, we have $u_x =
v_y$. Similarly, we obtain $u_y = -v_x$ from $\nabla^2 \phi = 0$. Hence, $u$ and $v$ satisfy the CR
relations in $D$, which indicates the analyticity of $f$ in $D$. ∎


7.2 Complex Integrations 


7.2.1 Integration of Complex Functions 


We now turn to the integration of functions $f(z)$ with respect to a complex
variable $z$. The theory of integration in the complex plane is just the theory
of the line integral as defined by

$$\int_C f(z) \, dz = \lim_{N \to \infty,\ \Delta z_i \to 0} \sum_{i=1}^{N} f(z_i) \Delta z_i.$$

Here $(\Delta z_i)$ is a sequence of small segments, situated at $z_i$, of the curve $C$ that
connects the complex number $\alpha_1$ to the other number $\alpha_2$ in the $z$-plane.
Since there are infinitely many choices for connecting $\alpha_1$ to $\alpha_2$, it is possible
to obtain different values for the integral for different paths.

Examples Consider the contour integral

$$I = \int z^* \, dz$$

from $z = 1$ to $z = -1$ along the three paths (see Fig. 7.5): (i) the unit circle
centered at the origin in the counterclockwise direction, designated by $C_1$;
(ii) that in the clockwise direction, denoted by $C_2$; and (iii) the real axis, $C_3$.

Fig. 7.5. Three paths $C_1$, $C_2$, and $C_3$


(i) The values of $z$ on the circle are given by $z = e^{i\theta}$, so $dz = ie^{i\theta} d\theta$. Thus,

$$I(C_1) = \int_{C_1} z^* \, dz = \int_0^{\pi} e^{-i\theta}\, i e^{i\theta} \, d\theta = \pi i. \tag{7.19}$$

(ii) In a similar manner as in (i), we have

$$I(C_2) = \int_{C_2} z^* \, dz = \int_0^{-\pi} e^{-i\theta}\, i e^{i\theta} \, d\theta = -\pi i. \tag{7.20}$$

(iii) On the real axis, $z = x$ and $dz = dx$ so that

$$I(C_3) = \int_{C_3} z^* \, dz = \int_1^{-1} x \, dx = 0. \tag{7.21}$$
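The path dependence of this integral can be reproduced directly from the defining sum. The sketch below approximates $\sum_i f(z_i) \Delta z_i$ with $z_i$ taken at the midpoint of each chord; the parametrizations of the three paths over $0 \le t \le 1$ and the number of segments are choices made here for illustration.

```python
import cmath
import math


def line_integral(f, path, n=20000):
    """Midpoint-rule estimate of the integral of f(z) dz along path(t), 0 <= t <= 1."""
    total = 0j
    for k in range(n):
        z_a, z_b = path(k / n), path((k + 1) / n)
        total += f(0.5 * (z_a + z_b)) * (z_b - z_a)
    return total


def conj(z):
    return z.conjugate()


def c1(t):   # upper unit semicircle from 1 to -1, counterclockwise
    return cmath.exp(1j * math.pi * t)


def c2(t):   # lower unit semicircle from 1 to -1, clockwise
    return cmath.exp(-1j * math.pi * t)


def c3(t):   # real axis from 1 to -1
    return complex(1.0 - 2.0 * t, 0.0)


i1, i2, i3 = (line_integral(conj, c) for c in (c1, c2, c3))
print(i1, i2, i3)
```

The three estimates differ from one another, which is possible because $z^*$ is not analytic; along the real axis the contributions from $x > 0$ and $x < 0$ cancel.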


In general, complex integrals on the path C possess the following property: 

♠ Darboux’s inequality: 

Contour integrals on a path C satisfy the relation 

$$\left|\int_C f(z)\,dz\right| \le ML, \tag{7.22}$$

where M = max |f(z)| on C and L is the length of C. 

This property is very useful because in working with complex line integrals it 
is often necessary to establish upper bounds on their absolute values. 

Proof Recall the original definition of complex integrals: 

$$\int_C f(z)\,dz = \lim_{n\to\infty} \sum_{k=1}^{n} f(z_k)\,\Delta z_k.$$

It follows that 

$$\left|\sum_{k=1}^{n} f(z_k)\,\Delta z_k\right| \le \sum_{k=1}^{n} |f(z_k)|\,|\Delta z_k| \le M\sum_{k=1}^{n} |\Delta z_k| \le ML,$$

where we have used the facts that $|f(z)| \le M$ for all points z on C, that 
$\sum |\Delta z_k|$ represents the sum of all the chord lengths joining $z_{k-1}$ and $z_k$, and 
that this sum is not greater than the length of C. Taking the limit of both 
sides, we obtain the desired inequality (7.22). ♣ 
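As an illustration of how the bound ML behaves in practice, the sketch below compares the modulus of a Riemann-sum approximation of the integral with M times L. The choice of f(z) = exp(z) and of the upper unit semicircle as the path are assumptions made only for this example.

```python
import cmath
import math

def path(t):
    """Upper unit semicircle from z = 1 (t = 0) to z = -1 (t = pi)."""
    return cmath.exp(1j * t)

n = 4000
ts = [math.pi * k / n for k in range(n + 1)]
points = [path(t) for t in ts]

# Riemann sum of f(z) dz, with f sampled at the midpoint parameter of each chord
integral = sum(
    cmath.exp(path(0.5 * (ts[k] + ts[k + 1]))) * (points[k + 1] - points[k])
    for k in range(n)
)

M = max(abs(cmath.exp(z)) for z in points)                  # max |f| on the sampled path
L = sum(abs(points[k + 1] - points[k]) for k in range(n))   # sum of chord lengths

print(abs(integral), M * L)
```

The first printed number stays below the second, as the inequality requires; the bound is usually far from sharp, which is harmless when only an upper estimate is needed.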


7.2.2 Cauchy Theorem 


We are now in a position to proceed with the key theorem in the theory of 
functions of a complex variable. Consider the complex integral 


$$I(C_i) = \int_{C_i} \sin z\,dz$$

along the paths $C_i$ (i = 1, 2, 3) shown in Fig. 7.6: (a) $C_1$ = OP, (b) 
$C_2$ = OQ + QP, (c) $C_3$ = OR + RP. After some algebra, we obtain 

$$I(C_1) = I(C_2) = I(C_3) = \cdots,$$

which suggests that the integral from O to P does not depend on 
our choice of integration path. This is indeed true; 
the integral depends only on the two endpoints O and P. This peculiarity of integration 
comes from the fact that the integrand sin z is analytic on the integration 
paths in question. (In fact, it is analytic everywhere on the complex plane.) 
This result can be generalized to the following statement, called Cauchy’s 
theorem, which is pivotal in the theory of complex function analysis. 

Fig. 7.6. Three paths: OQP, OP, and ORP 


♠ Cauchy’s theorem: 

If f(z) is analytic within and on a closed contour C, then 

$$\oint_C f(z)\,dz = 0. \tag{7.23}$$

The somewhat lengthy discussion needed for a proof of Cauchy’s 
theorem is beyond the scope of this textbook, but two immediate corollaries 
of the theorem are listed below. 


♠ Path independence: 

If f(z) is analytic in the region R and if contours $C_1$ and $C_2$ lie in R 
and have the same endpoints, then 

$$\int_{C_1} f(z)\,dz = \int_{C_2} f(z)\,dz.$$

The proof readily follows by applying Cauchy’s theorem to the closed contour 
consisting of $C_2$ and $-C_1$ as shown in Fig. 7.7: 

$$\int_{C_2} f(z)\,dz + \int_{-C_1} f(z)\,dz = \int_{C_2} f(z)\,dz - \int_{C_1} f(z)\,dz = 0.$$

Intuitively, the symbol $-C$ denotes the contour C traced in the opposite 
direction. A discussion on the path independence follows the theorem below. 


♠ Uniqueness of the integral: 

If f(z) is analytic within a region bounded by a closed contour C, then 
the integral $\int_{z_1}^{z_2} f(z)\,dz$ along any contour within C depends only on $z_1$ 
and $z_2$. 


This theorem states that an analytic function f(z) has not only a unique 
derivative but also a unique integral. From a practical viewpoint, this theorem is frequently 
used in the evaluation of contour integrals, since it allows us to choose an 
appropriate contour. 

Remark. When integrating along a closed contour, we agree to move along 
the contour in such a way that the enclosed region lies to our left. An inte- 
gration that follows this convention is called integration in the positive sense. 
Integration performed in the opposite direction acquires a minus sign. 


7.2.3 Integrations on a Multiply Connected Region 



Fig. 7.8. Left : A simply connected region. Right : A multiply connected region 


We should note that Cauchy’s theorem applies in a direct way only to simply 
connected regions. A region R is said to be simply connected if every 
closed curve in R can be continuously contracted into a point without leaving 
R. Otherwise, it is said to be multiply connected (see Fig. 7.8). The 
reason for this restriction is easy to find. The important fact is that 
Cauchy’s theorem requires that no singular point be included within 
the region bounded by the contour C. If the region R bounded by C is multiply 
connected, singular points may lie within the closed 
contour C but outside the region R in question. In this case, Cauchy’s 
theorem no longer holds even though the integrand f(z) is analytic everywhere 
in the region. 

Nevertheless, there is still a way to apply Cauchy’s theorem to multiply 
connected regions, which is based on allowing the deformation of contours as 
described below. 

Suppose that f(z) is analytic in the region that lies between two closed 
contours C and C', where C encloses C'. Draw two lines AB and EF close 
together, so as to connect the two contours. Then ABDEFGA, described as 
shown in Fig. 7.9, is a closed contour, which we denote by S, and f(z) is analytic 
within it. Then, we have 

$$\oint_S f(z)\,dz = 0.$$

Fig. 7.9. Closed contour ABDEFGA that consists of C and C' 

Now let the lines AB and EF approach infinitely close to one another. The 
contribution from the part BDE tends toward the integral around C in the 
positive (i.e., counterclockwise) direction. Similarly, the contribution from 
FGA tends toward that around C' in the negative (clockwise) direction, thus 
minus that around C' in the positive direction. The contributions from AB 
and EF approach equal and opposite values since they ultimately become the 
same path described in opposite directions. We thus come to the conclusion 
that 

$$\oint_C f(z)\,dz = \oint_{C'} f(z)\,dz.$$

This means that, if a function is analytic between two contours, its integrals 
around both contours have the same value. 
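The deformation of contours can be illustrated numerically. In the sketch below, the integrand 1/z (singular only at the origin), the outer ellipse, and the inner circle are all choices made for this example; the helper `closed_integral` is a hypothetical name for a simple Riemann sum around a closed curve.

```python
import cmath
import math

def closed_integral(f, path, n=4000):
    """Riemann sum of f(z) dz once around the closed curve z = path(t), 0 <= t <= 2*pi."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        z0 = path(k * dt)
        z1 = path((k + 1) * dt)
        total += f(path((k + 0.5) * dt)) * (z1 - z0)
    return total

def f(z):
    return 1 / z   # analytic everywhere between the two contours

def outer(t):
    return 3 * math.cos(t) + 2j * math.sin(t)   # ellipse C, counterclockwise

def inner(t):
    return 0.5 * cmath.exp(1j * t)              # small circle C', counterclockwise

I_outer = closed_integral(f, outer)
I_inner = closed_integral(f, inner)
print(I_outer, I_inner)
```

Both contours enclose the same singularity and the integrand is analytic between them, so the two printed values agree to within the discretization error.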

Remark. There is an immediate extension to the case where C encloses several 
closed paths all external to one another. Because of Cauchy’s 
theorem, an integration contour can be moved across any region of the complex 
plane over which the integrand is analytic without changing the value of the 
integral. It cannot be moved across a hole (the shaded area) or a singularity 
(the dot), but it can be made to collapse around either, as shown in Fig. 7.10. 
As a result, an integration contour C enclosing n holes or singularities can 
be replaced by n separated closed contours $C_i$, each enclosing a hole or a 
singularity as given by 

$$\oint_C f(z)\,dz = \sum_{i=1}^{n} \oint_{C_i} f(z)\,dz.$$

Fig. 7.10. Collapse of an integration path onto the boundaries of a hole (a large 
shaded region) and singularity (a small shaded dot) 


7.2.4 Primitive Functions 

Here is a definition of the primitive function of a complex function. 

♠ Primitive function: 

Let f(z) be a function that is continuous in a domain D and has the 
property $\oint_C f(z)\,dz = 0$ for every closed path C in D. Then, the primitive 
function F(z) of f(z) is defined by 

$$F(z) = \int_{z_0}^{z} f(z')\,dz' \qquad (z_0, z \in D),$$

which is analytic in D with the derivative 

$$\frac{dF(z)}{dz} = f(z).$$

Proof Consider the difference 

$$F(z+\Delta z) - F(z) = \int_{z_0}^{z+\Delta z} f(z')\,dz' - \int_{z_0}^{z} f(z')\,dz' = \int_{z}^{z+\Delta z} f(z')\,dz', \tag{7.24}$$

where we make use of the path-independence property. If we write 

$$\begin{aligned} \int_{z}^{z+\Delta z} f(z')\,dz' &= f(z)\int_{z}^{z+\Delta z} dz' + \int_{z}^{z+\Delta z} [f(z') - f(z)]\,dz' \\ &= f(z)\,\Delta z + \int_{z}^{z+\Delta z} [f(z') - f(z)]\,dz', \end{aligned}$$

then (7.24) becomes 

$$F(z+\Delta z) - F(z) - f(z)\,\Delta z = \int_{z}^{z+\Delta z} [f(z') - f(z)]\,dz'. \tag{7.25}$$

Since f(z) is continuous, corresponding to an arbitrarily small positive number 
$\varepsilon$, there is a number $\delta$ such that 

$$|z - z'| < \delta \;\Rightarrow\; |f(z) - f(z')| < \varepsilon.$$

Now choose $|\Delta z| < \delta$, which ensures $|z - z'| < \delta$ for z' on the path C in 
question. Therefore, we have 

$$\left|\int_{z}^{z+\Delta z} [f(z') - f(z)]\,dz'\right| \le \int_{z}^{z+\Delta z} |f(z') - f(z)|\,|dz'| < \varepsilon\,|\Delta z|,$$

and (7.25) can be written as 

$$\left|\frac{F(z+\Delta z) - F(z)}{\Delta z} - f(z)\right| < \varepsilon \quad \text{for } |\Delta z| < \delta.$$

Since $\varepsilon$ can be arbitrarily small, we conclude that 

$$\lim_{\Delta z \to 0} \frac{F(z+\Delta z) - F(z)}{\Delta z} = f(z),$$

or equivalently, 

$$\frac{dF(z)}{dz} = f(z).$$

This result is obtained for any point in D, so F(z) is analytic in D. ♣ 

Exercises 


1. Evaluate the integral 

$$I(C) = \int_C \sin z\,dz \tag{7.26}$$

along the two paths shown in Fig. 7.11: (a) $C_1$ = OB, (b) $C_2$ = OA + AB. 

Solution: Since sin z = sin(x + iy) = cosh y sin x + i sinh y cos x 
and dz = dx + i dy, we can divide (7.26) into real and imaginary 
parts as 

$$I(C) = \int_C (\cosh y \sin x\,dx - \sinh y \cos x\,dy) + i\int_C (\cosh y \sin x\,dy + \sinh y \cos x\,dx).$$

Noting x = y along the curve $C_1$, we have 

$$\begin{aligned} I(C_1) &= (1+i)\int_0^1 \cosh x \sin x\,dx - (1-i)\int_0^1 \sinh x \cos x\,dx \\ &= -[\cosh x \cos x]_0^1 + i\,[\sinh x \sin x]_0^1 \\ &= (1 - \cosh 1 \cos 1) + i(\sinh 1 \sin 1), \end{aligned} \tag{7.27}$$

where we employ partial integrations. Next we evaluate I along 
$C_2$. Along the path from O to A, x = 0 and dx = 0, and along the 
path from A to B, y = 1 and dy = 0. Therefore, 

$$\begin{aligned} I(C_2) = \int_{C_2} \sin z\,dz &= -\int_0^1 \sinh y\,dy + \cosh 1\int_0^1 \sin x\,dx + i\sinh 1\int_0^1 \cos x\,dx \\ &= (1 - \cosh 1 \cos 1) + i(\sinh 1 \sin 1). \end{aligned}$$

Observe that $I(C_1) = I(C_2)$. ♣ 
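The equality of the two path integrals in Exercise 1 can be confirmed numerically. The sketch below assumes, as read off from the solution, that O = 0, A = i, and B = 1 + i; the helper `segment_integral` is a hypothetical midpoint-rule routine for straight segments.

```python
import cmath

def segment_integral(f, z0, z1, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz along the straight segment z0 -> z1."""
    dz = (z1 - z0) / n
    return sum(f(z0 + (k + 0.5) * dz) for k in range(n)) * dz

O, A, B = 0j, 1j, 1 + 1j   # endpoints assumed from the solution of Exercise 1

I_C1 = segment_integral(cmath.sin, O, B)                                       # diagonal OB
I_C2 = segment_integral(cmath.sin, O, A) + segment_integral(cmath.sin, A, B)   # OA + AB
exact = 1 - cmath.cos(B)   # from the primitive function -cos z

print(I_C1, I_C2, exact)
```

The value obtained from the primitive function $-\cos z$, namely $1 - \cos(1+i)$, coincides with $(1 - \cosh 1 \cos 1) + i \sinh 1 \sin 1$.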
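2. Set C : |z| = r, and calculate the following integrals: 

(i) $\left|\oint_C \dfrac{dz}{z}\right|$ 

(ii) $\oint_C \dfrac{dz}{|z|}$ 

(iii) $\oint_C \dfrac{|dz|}{z}$ 

Solution: Let $z = re^{i\theta}$, which yields $dz = ire^{i\theta}\,d\theta$ and $|dz| = r\,d\theta$. 
Hence, we have the results: 
(i) $\left|\oint_C dz/z\right| = \left|\int_0^{2\pi} (ire^{i\theta})/(re^{i\theta})\,d\theta\right| = 2\pi$, 
(ii) $\oint_C dz/|z| = \int_0^{2\pi} (ire^{i\theta})/r\,d\theta = 0$, 
(iii) $\oint_C |dz|/z = \int_0^{2\pi} r/(re^{i\theta})\,d\theta = 0$. ♣ 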

3. Let f(z) be analytic on a unit circle D about the origin. For any two 
points $z_1$ and $z_2$ on D, there exist two points $\xi_1$ and $\xi_2$ on the line 
segment $[z_1, z_2]$ that satisfy the relation 

$$f(z_2) - f(z_1) = \left\{\mathrm{Re}\,[f'(\xi_1)] + i\,\mathrm{Im}\,[f'(\xi_2)]\right\}(z_2 - z_1). \tag{7.28}$$

Prove it. (This is a generalization of the mean value theorem that is 
valid for real functions.) 


Solution: From assumptions, we have 

$$\begin{aligned} f(z_2) - f(z_1) &= \int_{z_1}^{z_2} f'(z)\,dz = (z_2 - z_1)\int_0^1 f'[z_1 + t(z_2 - z_1)]\,dt \\ &= (z_2 - z_1)\left\{\int_0^1 \mathrm{Re}\,[f'(z_1 + t(z_2 - z_1))]\,dt + i\int_0^1 \mathrm{Im}\,[f'(z_1 + t(z_2 - z_1))]\,dt\right\}. \end{aligned} \tag{7.29}$$

Note that the integrals in the last line are both real. Hence, they 
satisfy the mean value theorem for integrals of real-valued functions 
g(t) that is expressed by 

$$\int_0^1 g[z_1 + t(z_2 - z_1)]\,dt = g[z_1 + c(z_2 - z_1)] \quad \text{with } 0 < c < 1.$$

Setting $\xi_k = z_1 + c_k(z_2 - z_1)$ with $0 < c_k < 1$ (k = 1, 2), we 
get the desired equation (7.28). ♣ 


7.3 Cauchy Integral Formula and Related Theorem 

7.3.1 Cauchy Integral Formula 

We now turn to the famous integral formula that is the chief tool in the 
application of the theory of analytic functions in physics. 

♠ Cauchy integral formula: 

If f(z) is analytic within and on a closed contour C, we have 

$$\oint_C \frac{f(z)}{z-a}\,dz = \begin{cases} 2\pi i f(a) & \text{if } a \text{ is interior to } C, \\ 0 & \text{if } a \text{ is exterior to } C. \end{cases} \tag{7.30}$$

Proof The latter case is trivial; when z = a is exterior to C, the integrand in 
(7.30) becomes analytic within C so that we have at once $\oint_C [f(z)/(z-a)]\,dz = 0$ 
by virtue of the Cauchy theorem. Hence, we consider below only the case where 
z = a is within C. 

Consider the integral 

$$J(a) = \oint_C \frac{f(z)}{z-a}\,dz \tag{7.31}$$

around a closed contour C within and on which f(z) is analytic. In view of 
the discussion in Sect. 7.2.3, the contour C may be deformed into a small 
circle of radius r about the point a. Accordingly, the variable z is expressed 
by $z = a + re^{i\theta}$. 

Now, we rewrite (7.31) as 

$$J(a) = f(a)\oint_C \frac{dz}{z-a} + \oint_C \frac{f(z) - f(a)}{z-a}\,dz. \tag{7.32}$$

The first integral on the right-hand side becomes 

$$\oint_C \frac{dz}{z-a} = \int_0^{2\pi} \frac{ire^{i\theta}}{re^{i\theta}}\,d\theta = 2\pi i. \tag{7.33}$$

Hence, (7.30) is confirmed if the second integral of (7.32) can be made arbitrarily 
small by a suitable choice of the radius r of the circle C. To show this, we note the continuity of 
f(z) at a, which tells us that for all $\varepsilon > 0$ there exists an appropriate quantity 
$\delta$ such that 

$$|z - a| < \delta \;\Rightarrow\; |f(z) - f(a)| < \varepsilon.$$

This implies that for any arbitrarily small $\varepsilon$, we can find $r = |z - a| < \delta$ that 
satisfies the relation 

$$\left|\oint_C \frac{f(z) - f(a)}{z-a}\,dz\right| \le \oint_C \frac{|f(z) - f(a)|}{|z-a|}\,|dz| < \frac{\varepsilon}{r}\,2\pi r = 2\pi\varepsilon. \tag{7.34}$$

Thus by taking r small enough, but still greater than zero, the absolute 
value of the integral can be made smaller than any preassigned number. From 
(7.32) to (7.34), we obtain the desired equation: 

$$\oint_C \frac{f(z)}{z-a}\,dz = 2\pi i f(a) \quad \text{if } a \text{ is within } C. \; ♣ \tag{7.35}$$

Remark. If a is a point located just on the contour C, the integral (7.30) must 
be interpreted as a principal value integral (see Sect. 9.4.1). 


The Cauchy integral formula gives us another hint by which to comprehend 
the rigid structure of analytic functions: If a function is analytic within and 
on a closed contour C, its value at every point inside C is determined by its 
values on the bounding curve C. 
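A short numerical experiment makes this statement concrete. The sketch below is only an illustration under our own assumptions: f(z) = exp(z), the contour |z| = 2, and the two sample points a are arbitrary choices, and `circle_integral` is a hypothetical helper that applies the midpoint rule to the parametrization z = c + R exp(it).

```python
import cmath
import math

def circle_integral(g, center, radius, n=2000):
    """Midpoint rule for the integral of g(z) dz counterclockwise around |z - center| = radius."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = radius * cmath.exp(1j * (k + 0.5) * dt)   # w = z - center
        total += g(center + w) * (1j * w) * dt        # dz = i w dt
    return total

f = cmath.exp
a_inside = 0.5 + 0.3j    # interior to |z| = 2
a_outside = 3.0 + 0.0j   # exterior to |z| = 2

J_inside = circle_integral(lambda z: f(z) / (z - a_inside), 0j, 2.0)
J_outside = circle_integral(lambda z: f(z) / (z - a_outside), 0j, 2.0)

print(J_inside / (2j * math.pi), f(a_inside))   # recovered value versus direct evaluation
print(J_outside)                                # vanishes up to rounding
```

Only the values of f on the circle enter the computation, yet the interior value f(a) is recovered.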


7.3.2 Goursat Formula 


A remarkable consequence of Cauchy’s integral formula is the fact that, 
when f(z) is analytic at z = a, all of its derivatives are also analytic. Furthermore, 
the region of analyticity for those derivatives is identical with that 
of f(z). To prove the theorem, we use the integral representation (7.35) to 
evaluate the derivative, 

$$\begin{aligned} 2\pi i f'(a) &= 2\pi i \lim_{h\to 0} \frac{f(a+h) - f(a)}{h} = \lim_{h\to 0} \frac{1}{h}\oint_C \left[\frac{f(z)}{z-a-h} - \frac{f(z)}{z-a}\right] dz \\ &= \lim_{h\to 0} \oint_C \frac{f(z)}{(z-a)(z-a-h)}\,dz = \oint_C \frac{f(z)}{(z-a)^2}\,dz. \end{aligned} \tag{7.36}$$

The last equality in (7.36) is verified from 

$$\left|\oint_C \left[\frac{f(z)}{(z-a-h)(z-a)} - \frac{f(z)}{(z-a)^2}\right] dz\right| = |h| \left|\oint_C \frac{f(z)}{(z-a)^2 (z-a-h)}\,dz\right| \le \frac{|h|\,ML}{b^2 (b - |h|)}, \tag{7.37}$$

where M is the maximum value of |f(z)| on the contour, L is the length of the 
contour, and b is the minimum value of |z - a| on the contour. The right-hand 
side of the inequality in (7.37) approaches zero as $h \to 0$, so we have 

$$\lim_{h\to 0} \oint_C \left[\frac{f(z)}{(z-a)(z-a-h)} - \frac{f(z)}{(z-a)^2}\right] dz = 0,$$

which ensures the equality of the last part of (7.36). 

We can continue with the same process to obtain higher derivatives, arriving 
at the general formula for the nth derivative of f at z = a: 

♠ Goursat formula: 

If f(z) is analytic within and on a closed contour C, we have 

$$f^{(n)}(a) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z-a)^{n+1}}\,dz \qquad (n = 0, 1, 2, \cdots). \tag{7.38}$$

Note that equation (7.38) guarantees the existence of all the derivatives 
$f'(a), f''(a), \cdots$ and the analyticity at all a’s within C. 

Remark. The Goursat formula (7.38) is valid only within the contour, and 
thus gives no information as to the existence of the derivatives just on the 
contour. 
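The Goursat formula, in the standard form f^(n)(a) = n!/(2 pi i) times the contour integral of f(z)/(z - a)^(n+1), also serves as a practical way to compute derivatives from function values on a contour. The following sketch assumes a circular contour centered at a; the function name `goursat_derivative` and the test functions are hypothetical choices for illustration.

```python
import cmath
import math

def goursat_derivative(f, a, order, radius=1.0, n=400):
    """Approximate the derivative of the given order of f at a from values on |z - a| = radius."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = radius * cmath.exp(1j * (k + 0.5) * dt)        # w = z - a
        total += f(a + w) / w ** (order + 1) * (1j * w) * dt
    return math.factorial(order) / (2j * math.pi) * total

a = 0.2 - 0.1j
# Every derivative of exp(z) equals exp(z) itself
exp_derivs = [goursat_derivative(cmath.exp, a, m) for m in range(5)]
# Third derivative of z**5 at z = 1 is 5 * 4 * 3 = 60
d3 = goursat_derivative(lambda z: z ** 5, 1.0 + 0j, 3)

print(exp_derivs, d3)
```

No difference quotients are taken; each derivative is obtained from a single integral of the undifferentiated function.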


7.3.3 Absence of Extrema in Analytic Regions 

An additional noteworthy fact associated with Cauchy’s integral formula 
(7.30) is that it points to the absence of either a maximum or a minimum of 
the modulus of an analytic function within a region of analyticity. 

For example, if z = a is a point within C, from (7.30) we see that 

$$f(a) = \frac{1}{2\pi} \int_0^{2\pi} f(a + \rho e^{i\phi})\,d\phi, \tag{7.39}$$

which means that f(a) is the arithmetic average of the values of f(z) on any 
circle centered at a. We thus have $|f(a)| \le M$, where M is the maximum 
value of |f| just on the circle. (Equality can occur only if f is constant on the 
contour.) 

The above argument applies to arbitrary points within the circle and, 
further, to a region bounded by any contour C (not necessarily a circle). We 
thus conclude that the inequality $|f(z)| \le M$ holds for all z within C, which 
means that |f(z)| has no maximum within the region of analyticity. 

Similarly, if f(z) has no zero within C, then 1/f(z) is an analytic function 
inside C and |1/f(z)| has no maximum within C, taking its maximum value 
on C. Therefore |f(z)| does not have a minimum within C but does have one 
on the contour C. We thus arrive at the following important theorems. 

Absolute maximum theorem: If a nonconstant function f(z) is analytic 
within and on a closed contour C, then |f(z)| can have no maximum 
within C. 

Absolute minimum theorem: 

If a nonconstant function f(z) is analytic within and on a closed contour 
C, and if $f(z) \ne 0$ there, then |f(z)| can have no minimum within C. 

Accordingly, points at which df/dz = 0 are saddle points, rather than true 
maxima or minima. 

We further observe that the theorems apply not only to |f(z)| but also to 
the real and imaginary parts of an analytic function. To see this, we rewrite 
(7.39) as 

$$2\pi f(a) = 2\pi (u_a + i v_a) \quad \text{and} \quad 2\pi f(a) = \int_0^{2\pi} f(x+iy)\,d\phi = \int_0^{2\pi} (u + iv)\,d\phi, \tag{7.40}$$

where $u_a$ and $v_a$ are the values of u(x, y) and v(x, y) at z = x + iy = a. 
Equating the last terms of the two equations in (7.40), we obtain 

$$u_a = \frac{1}{2\pi} \int_0^{2\pi} u\,d\phi \quad \text{and} \quad v_a = \frac{1}{2\pi} \int_0^{2\pi} v\,d\phi,$$

so that $u_a$ and $v_a$ are the arithmetic averages of the values of u(x, y) and 
v(x, y), respectively, on the boundary of the circle. Hence, based on the same 
reasoning as above, we see that both u and v take on their minimum and 
maximum values on the boundary curve of a region within which f is analytic. 
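The averaging property behind these statements is easy to observe numerically. In the sketch below, f(z) = exp(z), the center a, and the radius rho are arbitrary choices for illustration, and `circle_average` is a hypothetical helper that averages f over equally spaced points of the circle.

```python
import cmath
import math

def circle_average(f, a, rho, n=360):
    """Arithmetic mean of f over n equally spaced points on the circle |z - a| = rho."""
    return sum(f(a + rho * cmath.exp(2j * math.pi * (k + 0.5) / n)) for k in range(n)) / n

a, rho = 0.3 + 0.2j, 0.7
avg = circle_average(cmath.exp, a, rho)
center_value = cmath.exp(a)

# Largest modulus of f found on the sampled circle
max_on_circle = max(
    abs(cmath.exp(a + rho * cmath.exp(2j * math.pi * k / 360))) for k in range(360)
)

print(avg, center_value)                 # complex mean: real and imaginary parts match u_a and v_a
print(abs(center_value), max_on_circle)  # modulus at the center versus maximum on the circle
```

The real and imaginary parts of the mean reproduce $u_a$ and $v_a$ separately, and the modulus at the center does not exceed the largest modulus found on the circle.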

7.3.4 Liouville Theorem 

We saw in the previous discussion that |f(z)| has its maximum M on the 
boundary of the region of analyticity of f(z). In certain cases, the maximum 
of |f(z)| bounds the absolute value of derivatives $|f^{(n)}(z)|$, as stated in the 
theorem below. 


♠ Cauchy inequality: 

If f(z) is analytic within and on a circle C with a radius r centered at $z_0$, and M(r) is 
the maximum of |f(z)| on C, then we have 

$$\left|f^{(n)}(z_0)\right| \le \frac{n!}{r^n}\,M(r).$$

This is called the Cauchy inequality. 


Proof Goursat’s formula reads 

$$f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz.$$

Take $|z - z_0| = r$ and use the Darboux inequality to get the desired result: 

$$\left|f^{(n)}(z_0)\right| \le \frac{n!}{2\pi} \oint_C \frac{|f(z)|}{|(z - z_0)^{n+1}|}\,|dz| \le \frac{n!\,M(r)}{2\pi r^{n+1}}\,2\pi r = \frac{n!\,M(r)}{r^n}. \; ♣ \tag{7.41}$$


If the f(z) we have considered is analytic at all points on the complex plane, 
i.e., if it is an entire function, the above result reduces to the following 
theorem: 


♠ Liouville theorem: 

If f(z) is an entire function and |f(z)| is bounded for all values of z, 
then f(z) is a constant. 


Proof Let n = 1 and M(r) = M in (7.41) to obtain 

$$|f'(z_0)| \le \frac{M}{r}.$$

Since f(z) is an entire function, we may take r as large as we like. Thus we can 
make $|f'(z_0)| < \varepsilon$ for any preassigned $\varepsilon$. That is, $|f'(z_0)| = 0$, which implies 
that $f'(z_0) = 0$ for all $z_0$, so f(z) = const. ♣ 


Liouville’s theorem is a very powerful statement about analytic functions over 
the complex plane. In fact, if we restrict our attention to the real axis, then it 
becomes possible to find many real functions that are entire and bounded but 
are not constant; $\cos x$ and $e^{-x^2}$ are cases in point. In contrast, there is no 
such freedom for complex analytic functions; any nonconstant analytic function is either 
not bounded (goes to infinity somewhere on the complex plane) or not entire 
(is not analytic at some points of the complex plane). 
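The contrast between the real axis and the complex plane can be seen directly for the cosine. The sketch below, with sample points chosen arbitrarily for illustration, evaluates |cos z| along the real axis and along the imaginary axis, where cos(iy) = cosh y.

```python
import cmath
import math

# On the real axis the cosine never exceeds 1 in modulus
real_axis_max = max(abs(cmath.cos(complex(0.01 * k, 0.0))) for k in range(-1000, 1001))

# On the imaginary axis the same function grows like cosh(y)
imag_axis_values = [abs(cmath.cos(complex(0.0, y))) for y in (1.0, 5.0, 10.0)]

print(real_axis_max)
print(imag_axis_values)
```

The cosine is entire and bounded on the real line, but it is unbounded on the complex plane, in agreement with Liouville’s theorem.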


7.3.5 Fundamental Theorem of Algebra 

The next theorem follows easily from Liouville’s theorem and provides a remarkable 
tie-up between analysis and algebra. In what follows, the points z 
at which f(z) = 0 are called the zeros of f(z) or roots of f(z). 

♠ Fundamental theorem of algebra: 

Every nonconstant polynomial of degree n with complex coefficients has 
n zeros in the complex plane. 

Proof Let P(z) be any polynomial. If $P(z) \ne 0$ for all z, then the function 
f(z) = 1/P(z) is entire. Moreover, if P is nonconstant, then $|P| \to \infty$ as 
$|z| \to \infty$ so that f is bounded. Hence, in view of Liouville’s theorem, f must be 
a constant. This result means that P is also a constant, which is contrary to 
our assumption that P is a nonconstant polynomial. We thus conclude that 
P(z) has at least one zero in the complex plane. 

Furthermore, an induction argument shows that an nth-degree polynomial 
has n zeros (counting multiplicity; see Remark 1 below). If we assume that 
every kth-degree polynomial can be written 

$$P_k(z) = A(z - a_1) \cdots (z - a_k),$$

then, since a polynomial $P_{k+1}$ of degree k + 1 has at least one zero $a_0$ and 
can be divided by $(z - a_0)$ to leave a kth-degree polynomial, it follows that 

$$P_{k+1}(z) = A(z - a_0)(z - a_1) \cdots (z - a_k). \; ♣$$

Remark. 

1. The point a is called a zero of order k (or zero of multiplicity k) of 
the function P(z) if it reads 

$$P(z) = (z - a)^k Q(z),$$

where Q(z) is a polynomial with $Q(a) \ne 0$. Equivalently, a is a zero of 
order k if 

$$P(a) = P'(a) = \cdots = P^{(k-1)}(a) = 0 \quad \text{and} \quad P^{(k)}(a) \ne 0.$$

2. It can be shown that if $f_1(z)$ and $f_2(z)$ are analytic within and on C and 
if $|f_2(z)| < |f_1(z)| \ne 0$ on C, then $f_1(z)$ and $f_1(z) + f_2(z)$ have the same 
number of zeros within C. This is called Rouché’s theorem, which is 
verified in Sect. 9.3.4. 


7.3.6 Morera Theorem 

The final important theorem is called Morera’s theorem and is, in a sense, the 
converse of Cauchy’s theorem. 


♠ Morera theorem: 

Let f(z) be a continuous function on some domain D and suppose that 

$$\oint_C f(z)\,dz = 0$$

for every simple closed curve C in D whose interior also lies in D. Then f 
is analytic in D. 

Proof For some fixed point $z_0$ in D, define the function 

$$F(z) = \int_{z_0}^{z} f(z')\,dz', \quad z \in D,$$

where the path is along the line segment in D from $z_0$ to z. From this, we 
have 

$$\frac{F(z+\Delta z) - F(z)}{\Delta z} = \frac{f(z)}{\Delta z} \int_{z}^{z+\Delta z} dz' + \frac{1}{\Delta z} \int_{z}^{z+\Delta z} [f(z') - f(z)]\,dz' \;\to\; f(z) \quad (\Delta z \to 0),$$

where Darboux’s inequality is used to show that the second term vanishes in the limit $\Delta z \to 0$. 
As a result, we get 

$$F'(z) = f(z),$$

which indicates the existence of the first derivative of F(z), so F(z) is analytic 
in D. Since the derivative of an analytic function is itself analytic (Sect. 7.3.2), 
f(z) is also analytic. ♣ 


Exercises 

1. Let f(z) be analytic within a circle D : |z| = R, and let it satisfy the 
relations $|f(z)| \le M$ and f(0) = 0. 

(i) Prove that 

$$|f(z)| \le \frac{M}{R}\,|z| \quad \text{for } z \in D. \tag{7.42}$$

(ii) Prove that the equality in (7.42) holds at $z = z_0$ if and only if there 
exists a complex number c that yields |c| = 1 and 

$$f(z) = c\,\frac{M}{R}\,z. \tag{7.43}$$

Statements (i) and (ii) constitute the Schwarz lemma. 

Solution: (i) Inequality (7.42) holds trivially for z = 0. To 
consider the case of $z \ne 0$, we specify the circle D' : $|z| = \rho < R$ 
and set the function g(z) = f(z)/z. Since g is analytic within and 
on D', it follows from the theorem in Sect. 7.3.3 that 

$$|g(z)| \le \max_{z \in D'} |g(z)| \le \frac{M}{\rho},$$

which means that 

$$|f(z)| \le \frac{M}{\rho}\,|z| \quad \text{for } z \text{ within } D'.$$

By fixing z within D' and taking the limit of $\rho$ to R, we get 
(7.42). 

(ii) If the equality in (7.42) holds at some $z_0 \in D$ except at the 
origin, we have 

$$|g(z_0)| = \frac{M}{R} \ge |g(z)| \quad \text{for } z \in D.$$

It follows again from the theorem in Sect. 7.3.3 that g(z) must be 
constant within D. Hence, we have 

$$g(z) = c\,\frac{M}{R} \quad \text{with } |c| = 1.$$

This reduces to the desired result (7.43). ♣ 

2. Let f(z) be analytic on a domain D and $f(z) \not\equiv 0$. Show that if f(a) = 0 
with $a \in D$, then it is always possible to find small $\rho > 0$ such that 

$$0 < |z - a| < \rho \;\Rightarrow\; f(z) \ne 0.$$

This means that zeros of f(z) are necessarily isolated from each other. 

Solution: Suppose that z = a is a zero of order n of f(z). From the 
definition of a zero of a complex function, there exists an $n \in \mathbb{N}$ 
such that 

$$f^{(n)}(a) \ne 0 \quad \text{and} \quad f^{(p)}(a) = 0 \;\text{ for } p < n.$$

Hence, the Taylor series of f(z) around z = a reads 

$$f(z) = \sum_{p=0}^{\infty} \frac{f^{(n+p)}(a)}{(n+p)!}\,(z - a)^{n+p} = (z - a)^n g_n(z),$$

where 

$$g_n(z) = \sum_{p=0}^{\infty} \frac{f^{(n+p)}(a)}{(n+p)!}\,(z - a)^p, \quad \text{so} \quad g_n(a) = \frac{f^{(n)}(a)}{n!} \ne 0.$$

Since $g_n(z)$ is analytic at a, it is continuous there. Thus we can 
find $\rho > 0$ such that 

$$|z - a| < \rho \;\Rightarrow\; |g_n(z) - g_n(a)| < \frac{1}{2}\,\frac{|f^{(n)}(a)|}{n!}.$$

It follows from the triangle inequality that 

$$|g_n(z)| > |g_n(a)| - \frac{1}{2}\,\frac{|f^{(n)}(a)|}{n!} = \frac{1}{2}\,\frac{|f^{(n)}(a)|}{n!} > 0.$$

This implies that for our choice of $\rho$, 

$$0 < |z - a| < \rho \;\Rightarrow\; f(z) = (z - a)^n g_n(z) \ne 0. \; ♣$$

3. Obtain an alternative form of Cauchy’s integral formula expressed by 

$$f(z) = f(re^{i\phi}) = \frac{R^2 - r^2}{2\pi} \int_0^{2\pi} \frac{f(Re^{i\theta})}{R^2 - 2rR\cos(\theta - \phi) + r^2}\,d\theta$$

that is valid for |z| < R if f(z) is analytic for $|z| \le R$. This is called 
Poisson’s integral formula. 


Solution: Consider the function 

$$g(\zeta) = \frac{z^*}{R^2 - z^*\zeta}\,f(\zeta),$$

which is analytic for $|\zeta| \le R$. Hence, for the contour C : $|\zeta| = R$, 
we have $\oint_C g(\zeta)\,d\zeta = 0$. Furthermore, Cauchy’s integral formula 
tells us that $\oint_C f(\zeta)/(\zeta - z)\,d\zeta = 2\pi i f(z)$. From these two results, we 
obtain 

$$f(z) = \frac{1}{2\pi i} \oint_C \left[\frac{1}{\zeta - z} + \frac{z^*}{R^2 - z^*\zeta}\right] f(\zeta)\,d\zeta = \frac{R^2 - |z|^2}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)(R^2 - z^*\zeta)}\,d\zeta. \tag{7.44}$$

Setting $z = re^{i\phi}$ and $\zeta = Re^{i\theta}$, we have 

$$\begin{aligned} (\zeta - z)(R^2 - z^*\zeta) &= (Re^{i\theta} - re^{i\phi})(R^2 - re^{-i\phi} Re^{i\theta}) \\ &= Re^{i\theta}\left[R^2 - 2rR\cos(\theta - \phi) + r^2\right]. \end{aligned}$$

Substituting this, together with $d\zeta = iRe^{i\theta}\,d\theta$, in (7.44), we arrive at the desired formula. ♣ 
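Poisson’s integral formula can be tested numerically in the form f(r e^{i phi}) = (R^2 - r^2)/(2 pi) times the integral over theta from 0 to 2 pi of f(R e^{i theta}) / (R^2 - 2rR cos(theta - phi) + r^2). The sketch below assumes f(z) = exp(z), R = 2, and an arbitrary interior point; the function name `poisson` is a hypothetical choice.

```python
import cmath
import math

def poisson(f, R, r, phi, n=2000):
    """Midpoint-rule evaluation of the Poisson integral for the point z = r * exp(i * phi)."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        kernel = (R * R - r * r) / (R * R - 2 * r * R * math.cos(theta - phi) + r * r)
        total += kernel * f(R * cmath.exp(1j * theta))
    # (1 / (2 pi)) * sum * (2 pi / n) reduces to sum / n
    return total / n

R, r, phi = 2.0, 0.8, 0.6
from_boundary = poisson(cmath.exp, R, r, phi)
direct = cmath.exp(r * cmath.exp(1j * phi))

print(from_boundary, direct)
```

The kernel is real and positive, so the formula expresses f(z) as a weighted average of its boundary values, with the weight concentrated near the boundary point closest to z.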


7.4 Series Representations 

7.4.1 Circle of Convergence 

We now turn to a very important notion: series representations of complex 
analytic functions. To begin with, we note (without proof) that most of the 
definitions and theorems in connection with the convergence of series of real 
numbers and real functions presented in Chaps. 2 and 3 can be applied to 
complex counterparts with little or no change. Here we give a basic theo- 
rem regarding the convergence property of infinite power series consisting of 
complex numbers. 

♠ Theorem: 

If the power series 

$$\sum_{n=0}^{\infty} a_n z^n \tag{7.45}$$

converges at $z = z_0 \ne 0$, then it converges absolutely at every point of 
$|z| < |z_0|$ and, furthermore, it converges uniformly for $|z| \le \rho$ where 
$0 < \rho < |z_0|$. 

Proof We first prove the statement regarding absolute convergence. From the 
hypothesis, we see that the series $\sum a_n z_0^n$ converges. We set 

$$s_n = \sum_{k=0}^{n} a_k z_0^k,$$

to obtain 

$$|s_n - s_{n-1}| = |a_n z_0^n| \to 0 \quad (n \to \infty).$$

Hence, there exists a constant M > 0 that satisfies 

$$|a_n z_0^n| < M \quad \text{for all } n,$$

which implies 

$$\sum_{n=0}^{\infty} |a_n z^n| = \sum_{n=0}^{\infty} |a_n z_0^n| \left|\frac{z}{z_0}\right|^n \le M \sum_{n=0}^{\infty} \left|\frac{z}{z_0}\right|^n.$$

Therefore, if $|z| < |z_0|$, the right-hand side converges so that the series (7.45) 
converges absolutely. 

Next we consider uniform convergence. For every z satisfying the relation 
$|z| \le \rho < |z_0|$, we have 

$$\sum_{n=0}^{\infty} |a_n z^n| \le M \sum_{n=0}^{\infty} \left(\frac{\rho}{|z_0|}\right)^n < \infty,$$

since $0 < \rho/|z_0| < 1$. In view of the Weierstrass M-test, we conclude that the 
series (7.45) converges uniformly on the region of $|z| \le \rho$. ♣ 

This theorem states that the converging behavior of a power series 

$$\sum_{n=0}^{\infty} a_n z^n \tag{7.46}$$

can be classified into the following three types: 

1. It converges at all z. 

2. It converges (ordinarily and thus absolutely) at |z| < R, but diverges at 
|z| > R, in which the real constant R depends on the feature of the series. 

3. It diverges at all z except the origin. 

This classification leads us to introduce the concept of radius of convergence 
R of the power series (7.46). For the above three cases, it becomes 

1. $R = \infty$, 2. R itself, 3. R = 0, 

respectively. The circle C with the radius R about the origin is called the 
circle of convergence associated with the series. Note that just on C, the 
converging behavior of the corresponding series is inconclusive; it may or may 
not converge. 





The following theorems provide us with a clue for finding the radius of 
convergence of a given power series. 


♠ Theorems: 

Given a power series $\sum_{n=0}^{\infty} a_n z^n$, its radius of convergence R equals 

(i) $R = \lim_{n\to\infty} \left|\dfrac{a_n}{a_{n+1}}\right|$, if the limit exists; 

(ii) $R = \dfrac{1}{\limsup_{n\to\infty} \sqrt[n]{|a_n|}}$. 
7.4.2 Singularity on the Radius of Convergence 

Given a complex-valued power series, the convergence criterion based on the 
radius of convergence discussed in the previous subsection does not provide 
us with any information about the convergence property of the series just on 
the circle of convergence. We present below two important theorems regarding 
the latter point. 

♠ Theorem: 

If the power series $\sum_{n=0}^{\infty} a_n z^n$ has a radius of convergence R, then it 
has at least one singularity on the circle |z| = R. 


Proof Set 

$$f(z) = \sum_{n=0}^{\infty} a_n z^n.$$

If f(z) were analytic at every point on the circle of convergence, then for each 
$z_0$ with $|z_0| = R$, there would exist some maximal $\varepsilon_{z_0}$ such that f(z) could be 
continued analytically to a circular region $|z - z_0| < \varepsilon_{z_0}$, where $z_0$ is located 
on the circle |z| = R. (See Sect. 8.3 for details of analytic continuation.) 
Here $\varepsilon_{z_0}$ would depend on $z_0$ and we define 

$$\varepsilon = \min_{|z_0| = R} \varepsilon_{z_0} > 0.$$

By performing continuations successively for all possible $z_0$, we obtain a function 
g(z) that is analytic for $|z| < R + \varepsilon$. Clearly for |z| < R, g must be 
identical to f. In addition, g must have a power series representation, 

$$g(z) = \sum_{n=0}^{\infty} b_n z^n, \tag{7.47}$$

that is convergent for $|z| < R + \varepsilon$. Yet since for |z| < R 

$$g(z) = f(z) = \sum_{n=0}^{\infty} a_n z^n,$$

we conclude that 

$$b_n = a_n.$$

This implies that the radius of convergence of (7.47) would be R, which clearly 
gives us a contradiction. We thus conclude that f(z) has at least one singularity 
on the circle |z| = R. ♣ 


In general, it is difficult to determine when a function has a singularity at a 
particular point on the circle of convergence of its power series. The following 
theorem is one of the few results we have in this direction. 


♠ Theorem: 

Suppose that a power series $\sum_{n=0}^{\infty} a_n z^n$ has a radius of convergence 
$R < \infty$ and that $a_n \ge 0$ for all n. Then the series has a singularity at 
z = R on the real axis. 


Proof By the previous theorem, the function 

$$f(z) = \sum_{n=0}^{\infty} a_n z^n$$

has a singularity at some point $Re^{i\alpha}$. If we consider the power series for f 
about a point $\rho e^{i\alpha}$ with $0 < \rho < R$, we have 

$$f(z) = \sum_{n=0}^{\infty} b_n (z - \rho e^{i\alpha})^n = \sum_{n=0}^{\infty} \frac{f^{(n)}(\rho e^{i\alpha})}{n!}\,(z - \rho e^{i\alpha})^n,$$

where the radius of convergence is $R - \rho$. (If it were larger, the power series 
would define an analytic continuation of f beyond $Re^{i\alpha}$.) Note, however, 
that for any nonnegative integer j, the derivative reads 

$$f^{(j)}(\rho e^{i\alpha}) = \sum_{n=j}^{\infty} n(n-1)\cdots(n-j+1)\,a_n\,(\rho e^{i\alpha})^{n-j}.$$

Since $a_n \ge 0$, we have 

$$\left|f^{(j)}(\rho e^{i\alpha})\right| \le f^{(j)}(\rho).$$

This implies that the power series representation of f around $z = \rho$, expressed 
by 

$$f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\rho)}{n!}\,(z - \rho)^n,$$

must have a radius of convergence $R - \rho$. On the other hand, if f were analytic 
at z = R, the above power series would converge on a disc of radius greater 
than $R - \rho$. Therefore, f is singular at z = R. ♣ 


7.4.3 Taylor Series 


Below is one of the main theorems of this section, which states that any 
analytic function can be expanded into a power series around a point at 
which it is analytic. 


♠ Taylor series expansion: 

If f(z) is analytic within and on the circle C of radius r around z = a, 
then there exists a unique and uniformly convergent series in powers of 
(z - a), 

$$f(z) = \sum_{k=0}^{\infty} c_k (z - a)^k \qquad (|z - a| < r), \tag{7.48}$$

with 

$$c_k = \frac{f^{(k)}(a)}{k!} = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - a)^{k+1}}\,dz.$$

The largest circle C for which the power series (7.48) converges is called the 
circle of convergence of the power series and its radius is called the radius 
of convergence. 


Proof Let f(z) be analytic within and on a closed contour C. From Cauchy’s 
integral formula, we have 

$$f(a + h) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - a - h}\,dz, \tag{7.49}$$

where a and a + h are inside the contour C. The contour is taken to be a circle about a, 
inasmuch as the region of convergence of the resulting series is circular. We 
employ the identity 

$$\left[1 + \frac{h}{z-a} + \frac{h^2}{(z-a)^2} + \cdots + \frac{h^{N-1}}{(z-a)^{N-1}}\right] \left(\frac{z - a - h}{z - a}\right) = 1 - \frac{h^N}{(z-a)^N}$$

to obtain the exact expression 

$$\frac{1}{z - a - h} = \sum_{n=0}^{N-1} \frac{h^n}{(z-a)^{n+1}} + \frac{h^N}{(z - a - h)(z - a)^N}.$$

Substituting this into (7.49), we have 

$$f(a + h) = \sum_{n=0}^{N-1} \frac{h^n}{2\pi i} \oint_C \frac{f(z)}{(z-a)^{n+1}}\,dz + \frac{h^N}{2\pi i} \oint_C \frac{f(z)}{(z-a)^N (z - a - h)}\,dz. \tag{7.50}$$

Since the first integral can be replaced by the nth derivative of f at z = a 
by means of (7.38), we have 

$$f(a + h) = \sum_{n=0}^{N-1} \frac{h^n}{n!}\,f^{(n)}(a) + R_N, \tag{7.51}$$

where $R_N$ is the second term on the right-hand side of (7.50). It follows from 
(7.51) that if $\lim_{N\to\infty} R_N = 0$, the Taylor series expansion of f(z) around 
z = a is obtained successfully. This is indeed the case. As f(z) is analytic 
within and on the contour C, the absolute value of $R_N$ is bounded as 

$$|R_N| = \frac{|h|^N}{2\pi} \left|\oint_C \frac{f(z)}{(z-a)^N (z - a - h)}\,dz\right| \le \frac{|h|^N}{2\pi}\,\frac{M \cdot 2\pi r}{r^N (r - |h|)} = \frac{|h|^N M r}{r^N (r - |h|)}, \tag{7.52}$$

where r is the radius of the circle and M is the maximum value of |f| on the 
contour. Within the radius r, |h| < r so that 

$$\lim_{N\to\infty} R_N = 0.$$

Hence, we have 

$$f(a + h) = \sum_{n=0}^{\infty} \frac{h^n}{n!}\,f^{(n)}(a), \tag{7.53}$$

which holds at any point z = a + h within the circle of radius r. ♣ 

We note that the series (7.53) converges for $h$ as large as $|h| < r_c$, since
$R_N$ vanishes as $N \to \infty$ for any value of $|h|$ smaller than $r_c$. Furthermore, as
the inequality (7.52) holds whenever $f(z)$ is analytic within and on the circle
of radius $r$, the radius of convergence $r_c$ can extend up to the singularity
nearest to $z = a$. When the circle is extended beyond the
nearest singular point, the inequality becomes invalid, so that the Taylor series
expansion fails.
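The coefficient formula used in (7.50) and (7.51) can be checked numerically. The following is a minimal sketch, not taken from the text: it approximates the contour integral on a circle about $z = a$ by the trapezoidal rule and compares the result with $f^{(n)}(a)/n!$. The function name `taylor_coeff`, the test function $f(z) = e^z$, and the number of sample points are illustrative assumptions.

```python
import cmath
import math


def taylor_coeff(f, a, n, r=1.0, samples=256):
    """Approximate c_n = (1/(2*pi*i)) * closed integral of f(z)/(z-a)^(n+1) dz
    over the circle |z - a| = r (trapezoidal rule in the polar angle)."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        w = r * cmath.exp(1j * theta)   # w = z - a on the circle
        # dz = i*w*dtheta cancels one power of w in the denominator
        total += f(a + w) / w**n
    return total / samples


# For f(z) = exp(z) about a = 0 the exact coefficients are 1/n!.
coeffs = [taylor_coeff(cmath.exp, 0.0, n) for n in range(6)]
for n, c in enumerate(coeffs):
    print(n, c.real, 1 / math.factorial(n))
```

Because the integrand is smooth and periodic in the angle, the trapezoidal rule converges very quickly here; other quadrature rules would serve equally well.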


7.4.4 Apparent Paradoxes 

We have seen that the radius of convergence is determined by the distance to
the nearest singularity. Interestingly, this explains some apparent paradoxes
that occur if we restrict our attention only to values of the series along
the real axis of $z$.

A familiar example is the Taylor expansion of f(z) = 1/(1 — z) around the 
origin: 

$$\frac{1}{1-z} = 1 + z + z^2 + \cdots. \tag{7.54}$$

Obviously, both sides of (7.54) “blow up” at $z = 1$. At $z = -1$, on the other
hand, the right-hand side diverges, whereas the left-hand side has a finite
value of 1/2. Notably, this apparent paradox occurs at all points represented
by $z = e^{i\theta}$ with $\theta \neq 0$, i.e., at any point other than $z = 1$ on the unit circle
surrounding the origin. The
reason for this is clear from the point of view of the radius of convergence.
(We leave it to the reader.)

Another example is 

$$f(z) = e^{-1/z^2}.$$

Observe that $f^{(n)}(0) = 0$ for any $n = 0, 1, \cdots$ when the derivatives are taken along
the real axis, so if one puts this result
blindly into the Taylor formula around $z = 0$, one obtains the apparent nonsense
$e^{-1/z^2} = 0$. The point here is that $z = 0$ is a singularity, where the Taylor
series expansion is prohibited.

These two examples suggest the importance of realizing the difference be- 
tween the series representing a function and “the function itself.” A power 
series, such as a Taylor series, has only a limited range of representation char- 
acterized by the radius of convergence. Beyond this range, the power series 
is unable to represent the function. For example, the function considered in 
(7.54), 

$$f(z) = \frac{1}{1-z}, \tag{7.55}$$

exists and is analytic everywhere except at $z = 1$, but its power series around
$z = 0$, given by

$$1 + z + z^2 + \cdots,$$

exists and represents $f$ only within the unit circle centered at the origin (i.e.,
$|z| < 1$). The region in which a power series reproduces its original function
depends on the explicit form of the series expansion. In fact, an alternative
series expansion of (7.55) around $z = 3$ is given by

$$-\frac{1}{2} + \frac{1}{4}(z-3) - \frac{1}{8}(z-3)^2 + \cdots,$$

which exists and represents (7.55) only within the circle of radius 2 centered
at $z = 3$. We thus conclude that power series (including Taylor’s, Laurent’s,
and others) are not a versatile mold by means of which
one can cast a complete copy of the function. Each series is only one piece of such a mold: it can reproduce the
behavior of $f$ only within the region where the series converges, but gives no
indication of the shape of $f$ beyond its range.
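The limited range of each expansion can be seen numerically. The sketch below (an illustration, not part of the original text) sums the two series for $1/(1-z)$ quoted above, one about $z = 0$ and one about $z = 3$, at a point inside each disc. The helper names and the number of terms are assumptions made for the example.

```python
def f(z):
    return 1 / (1 - z)


def series_about_0(z, terms=60):
    # 1 + z + z^2 + ...  (represents f only for |z| < 1)
    return sum(z**n for n in range(terms))


def series_about_3(z, terms=60):
    # -1/2 + (z-3)/4 - (z-3)^2/8 + ... = -(1/2) * sum_n (-(z-3)/2)^n
    # (represents f only for |z - 3| < 2)
    return sum(-0.5 * (-(z - 3) / 2) ** n for n in range(terms))


for z in (0.5, 2.5):
    print(z, f(z), series_about_0(z), series_about_3(z))
```

At $z = 0.5$ only the first series reproduces $f$, and at $z = 2.5$ only the second does; the partial sums of the other series grow without bound as more terms are added.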


7.4.5 Laurent Series 

When expanding a function f(z) around its singular point z = a, Taylor’s 
expansion is obviously not suitable but we can obtain an alternative expansion 
that is valid for a singular point. The latter kind of expansion is called a 
Laurent series expansion. Laurent series enter quite often in mathematical 
analyses of physical problems, where functions to be considered have a finite 
number of singularities. 





♠ Laurent series expansions:

Let $f(z)$ be analytic within and on a closed contour $C$ except at a point
$z = a$ enclosed by $C$. Then, $f(z)$ can be expanded around $z = a$ as

$$f(z) = \sum_{n=-\infty}^{\infty} c_n (z-a)^n, \tag{7.56}$$

with the definition

$$c_n = \frac{1}{2\pi i}\oint_C \frac{f(z)}{(z-a)^{n+1}}\,dz. \tag{7.57}$$

The series (7.56) with the constants (7.57) is called the Laurent series
expansion of $f(z)$.




Fig. 7.12. Conversion of a closed contour $C$ into $C_1 + C_2$ so as not to involve the
singularity of $f(z)$ at $z = a$ in it


Proof The trick to deriving a Laurent series expansion is to use the contour
$C_1 + C_2$ illustrated in Fig. 7.12 such that its interior does not contain the
singular point of $f(z)$ at $z = a$ (i.e., $f$ is analytic within and on the contour).
As is indicated, the original contour $C$ can be reduced to two circular contours
$C_1$ and $C_2$ encircling $z = a$ counterclockwise and clockwise, respectively.
Applying Cauchy’s integral formula at a point $z = a + h$ lying between $C_2$ and $C_1$, we have

$$f(a+h) = \frac{1}{2\pi i}\oint_{C_1+C_2}\frac{f(z)}{z-a-h}\,dz = \frac{1}{2\pi i}\oint_{C_1}\frac{f(z)}{z-a-h}\,dz + \frac{1}{2\pi i}\oint_{C_2}\frac{f(z)}{z-a-h}\,dz. \tag{7.58}$$

Note that $|z-a| > |h|$ on the contour $C_1$ and $|z-a| < |h|$ on $C_2$. We thus
have

$$\frac{1}{z-a-h} = \frac{1}{z-a}\,\frac{1}{1-\frac{h}{z-a}} = \frac{1}{z-a}\sum_{n=0}^{\infty}\left(\frac{h}{z-a}\right)^n \quad \text{on } C_1 \tag{7.59}$$

and

$$\frac{1}{z-a-h} = \frac{1}{h}\,\frac{1}{\frac{z-a}{h}-1} = -\frac{1}{h}\sum_{n=0}^{\infty}\left(\frac{z-a}{h}\right)^n \quad \text{on } C_2. \tag{7.60}$$

The substitution of these two expressions into (7.58) yields 


$$f(a+h) = \frac{1}{2\pi i}\left[\oint_{C_1}\sum_{n=0}^{\infty}\frac{h^n}{(z-a)^{n+1}}\,f(z)\,dz - \oint_{C_2}\sum_{n=1}^{\infty}\frac{(z-a)^{n-1}}{h^n}\,f(z)\,dz\right]. \tag{7.61}$$

The order of integration and summation within the square brackets can be
reversed since the infinite series involved in the integrals converge uniformly on the contours. Eventually,
we obtain

$$f(a+h) = \sum_{n=-\infty}^{\infty} c_n h^n, \qquad c_n = \frac{1}{2\pi i}\oint \frac{f(z)}{(z-a)^{n+1}}\,dz. \tag{7.62}$$

Here, the contour for the coefficients $c_n$ is $C_1$ for $n \ge 0$ and $C_2$ for $n < 0$,
each traversed in the positive (counterclockwise) direction; the minus sign in
(7.61) compensates for the clockwise orientation of $C_2$. The series (7.62) is
what we call the Laurent series expansion of $f(z)$ around the singular point
$z = a$. Note that $C_1$ can be taken as the contour for all values of $n$.
This is because the integrand is analytic
in the region between $C_1$ and $C_2$, which allows us to expand the
contour $C_2$ until it coincides with the larger contour $C_1$. ♣


7.4.6 Regular and Principal Parts 

An important property of Laurent series is the series resolution. To see this,
we rewrite (7.62) as follows:

$$f(a+h) = \sum_{n=0}^{\infty} c_n h^n + \sum_{n=1}^{\infty} c_{-n} h^{-n}. \tag{7.63}$$

The first term in (7.63) converges everywhere within the outer circle of convergence,
whereas the second term converges anywhere outside the inner circle.
This means that the Laurent series expansion resolves the original function
$f(z)$ into two parts: one that is analytic within the outer circle of convergence,
and the other that is analytic outside the inner circle of convergence.
Obviously, each part is analytic over different portions of the complex plane.
The part of the Laurent series consisting of nonnegative powers of $h$ is called
the regular part. The other part, consisting of negative powers, is called the
principal part. Either part (or both) may terminate at a finite degree of the
sum or be identically zero. In particular, when the principal part is identically
zero, $f(z)$ is analytic at $z = a$, and the Laurent series is identical with
the Taylor series.

Remark. At first glance, the regular part exhibited in (7.63) resembles the
Taylor series. However, this is not the case; the $n$th coefficient cannot generally
be associated with $f^{(n)}(a)$ because the latter may not exist. In most
applications, $f(z)$ is not analytic at $z = a$.


7.4.7 Uniqueness of Laurent Series 

Taylor and Laurent series allow us to express an analytic function as a power
series. For a Taylor series of $f(z)$, the expansion is routine because the coefficient
of its $n$th term is simply $f^{(n)}(z_0)/n!$, where $z_0$ is the center of the circle
of convergence. In contrast, for the case of a Laurent series expansion, the
$n$th coefficient is not (in general) easy to evaluate. It can usually be found by
inspection and certain manipulations of other known series, but if we use such
an intuitive approach to determine the coefficients, we cannot be sure that
the result we obtain is correct. The following theorem addresses this issue.

♠ Theorem:

If the series

$$\sum_{n=-\infty}^{\infty} a_n (z-z_0)^n \tag{7.64}$$

converges to $f(z)$ at all points in some annular region around $z_0$, then it is
the unique Laurent series expansion of $f(z)$ in that region.


Proof Multiply both sides of (7.64) by

$$\frac{1}{2\pi i\,(z-z_0)^{k+1}},$$

integrate the result along a contour $C$ in the annular region, and use the easily
verifiable fact that

$$\frac{1}{2\pi i}\oint_C \frac{dz}{(z-z_0)^{k-n+1}} = \delta_{kn}$$

to obtain

$$\frac{1}{2\pi i}\oint_C \frac{f(z)}{(z-z_0)^{k+1}}\,dz = a_k.$$

Thus, the coefficient $a_k$ in the power series (7.64) is precisely the coefficient in
the Laurent series of $f(z)$ given in (7.57), and the two must be identical. ♣


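Remark. A Laurent series is unique only for a specified annulus. In general, a
function $f(z)$ can possess two or more entirely different Laurent series about
a given point, valid for different (nonoverlapping) regions. For instance,

$$f(z) = \frac{1}{z(1-z)} = \begin{cases} \dfrac{1}{z} + 1 + z + z^2 + \cdots, & 0 < |z| < 1, \\[1ex] -\dfrac{1}{z^2} - \dfrac{1}{z^3} - \dfrac{1}{z^4} - \cdots, & 1 < |z| < \infty. \end{cases}$$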
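The dependence of the Laurent coefficients on the annulus can be illustrated numerically. The following sketch (an illustration under stated assumptions, not the book's method) evaluates the coefficient integral (7.57) for $f(z) = 1/[z(1-z)]$ on a counterclockwise circle of radius 0.5 and on one of radius 2. The function name `laurent_coeff` and the sample count are hypothetical choices.

```python
import cmath
import math


def laurent_coeff(f, a, n, r, samples=512):
    """Approximate c_n = (1/(2*pi*i)) * closed integral of f(z)/(z-a)^(n+1) dz
    over the counterclockwise circle |z - a| = r (trapezoidal rule)."""
    total = 0j
    for k in range(samples):
        w = r * cmath.exp(2j * math.pi * k / samples)  # w = z - a
        total += f(a + w) * w ** (-n)                  # dz = i*w*dtheta absorbed
    return total / samples


def f(z):
    return 1 / (z * (1 - z))


# Circle inside 0 < |z| < 1: expect 1/z + 1 + z + ...
inner = {n: laurent_coeff(f, 0.0, n, 0.5) for n in (-2, -1, 0, 1)}
# Circle inside 1 < |z| < infinity: expect -1/z^2 - 1/z^3 - ...
outer = {n: laurent_coeff(f, 0.0, n, 2.0) for n in (-3, -2, -1, 0)}
print(inner)
print(outer)
```

The same formula yields two different coefficient sets because the two circles lie in different annuli of analyticity, separated by the singularity at $z = 1$.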


7.4.8 Techniques for Laurent Expansion 

The following examples illustrate several useful techniques for the construction 
of Taylor and Laurent series. 


(a) Use of geometric series 

Suppose that a function 

$$f(z) = \frac{1}{z-a} \tag{7.65}$$

fails to be analytic at $z = a$. We would like to obtain the Laurent series of
$f(z)$ around $z = 0$. First we note that for $|z| < |a|$, $f(z)$ reads

$$\frac{1}{z-a} = -\frac{1}{a}\,\frac{1}{1-(z/a)} = -\frac{1}{a}\sum_{n=0}^{\infty}\left(\frac{z}{a}\right)^n. \tag{7.66}$$

This is obviously the Taylor series expansion of $f(z)$ around the point $z = 0$.
That is, for $|z| < |a|$, the Laurent series of $f(z)$ given in (7.65) becomes
identical to its Taylor series. Nevertheless, this is not the case for $|z| > |a|$,
since the radius of convergence of (7.66) is $R = |a|$. Hence, we should also evaluate the
Laurent series around $z = 0$ that is valid for $|z| > |a|$. In a similar manner as
above, we obtain

$$\frac{1}{z-a} = \frac{1}{z}\,\frac{1}{1-(a/z)} = \frac{1}{z}\sum_{n=0}^{\infty}\left(\frac{a}{z}\right)^n = \sum_{n=0}^{\infty}\frac{a^n}{z^{n+1}} \quad \text{for } |z| > |a|. \tag{7.67}$$


Expansions (7.66) and (7.67) both serve as the Laurent series expansions of
$f(z)$, although their regions of convergence are different from one
another.

Remark. The function $f(z)$ given in (7.65) can be expanded by this method
about any point $z = b$. Indeed, write

$$f(z) = \frac{1}{(z-b)-(a-b)} \qquad (b \neq a).$$

Then, either

$$f(z) = -\frac{1}{a-b}\sum_{n=0}^{\infty}\frac{(z-b)^n}{(a-b)^n} \qquad (|z-b| < |a-b|)$$

or

$$f(z) = \sum_{n=0}^{\infty}\frac{(a-b)^n}{(z-b)^{n+1}} \qquad (|z-b| > |a-b|).$$


(b) Rational fraction decomposition 

Next we consider the function

$$f(z) = \frac{1}{z^2 - (2+i)z + 2i}.$$

The roots of the denominator are $z = i$ and $z = 2$, which are the only points
at which $f(z)$ fails to be analytic. Hence, $f(z)$ has a Taylor series about $z = 0$
that is valid for $|z| < 1$ and two Laurent series about $z = 0$ that are valid for
$1 < |z| < 2$ and $|z| > 2$. To obtain them, we use the identities

$$z^2 - (2+i)z + 2i = (z-i)(z-2)$$

and

$$f(z) = \frac{1}{(z-i)(z-2)} = \frac{1}{2-i}\left(\frac{1}{z-2} - \frac{1}{z-i}\right).$$

When we want the Laurent series of $f(z)$ around $z = 0$ that is valid for
$1 < |z| < 2$, it suffices to expand the function $1/(z-2)$ in the Taylor series
about $z = 0$ [see (a) above] and then expand $1/(z-i)$ in the Laurent series
about $z = 0$ that is valid for $|z| > 1$. (The latter series is also valid for
$1 < |z| < 2$.) If these two series are subtracted, we obtain a series for $f(z)$
that is valid for $1 < |z| < 2$, which is the desired Laurent series.
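The recipe above can be written out and tested at a sample point of the annulus. In the sketch below (an illustration, with hypothetical helper names and an arbitrary truncation at 200 terms), $1/(z-2)$ is expanded as $-\sum_n z^n/2^{n+1}$ for $|z| < 2$ and $1/(z-i)$ as $\sum_n i^n/z^{n+1}$ for $|z| > 1$, both following from technique (a).

```python
def f(z):
    return 1 / ((z - 1j) * (z - 2))


def laurent_in_annulus(z, terms=200):
    """Truncated Laurent series of f about z = 0, valid for 1 < |z| < 2."""
    # Taylor part: 1/(z-2) = -sum_n z^n / 2^(n+1), valid for |z| < 2
    taylor_part = sum(-(z**n) / 2 ** (n + 1) for n in range(terms))
    # Principal part: 1/(z-i) = sum_n i^n / z^(n+1), valid for |z| > 1
    principal_part = sum(1j**n / z ** (n + 1) for n in range(terms))
    return (taylor_part - principal_part) / (2 - 1j)


z = 1.2 + 0.9j  # |z| = 1.5 lies inside the annulus
print(f(z), laurent_in_annulus(z))
```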


(c) Differentiation 

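The method used in (b) fails for functions with a double root in the denominator
such as

$$f(z) = \frac{1}{(z-1)^2}.$$

Among alternative methods, the simplest one is differentiation:

$$\frac{1}{(z-1)^2} = \frac{d}{dz}\left(\frac{1}{1-z}\right).$$

From the discussions regarding the earlier case (a), the function $1/(1-z)$ is
seen to be represented by

$$\frac{1}{1-z} = \begin{cases} \displaystyle\sum_{n=0}^{\infty} z^n, & |z| < 1, \\[1ex] \displaystyle -\sum_{n=0}^{\infty} z^{-(n+1)}, & |z| > 1. \end{cases}$$

Hence, term-by-term differentiation yields

$$\frac{1}{(z-1)^2} = \begin{cases} \displaystyle\sum_{n=0}^{\infty} (n+1)\,z^n, & |z| < 1, \\[1ex] \displaystyle\sum_{n=0}^{\infty} (n+1)\,z^{-(n+2)}, & |z| > 1. \end{cases}$$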

Exercises 


1. Let $f(z)$ be an entire function. Employ the Taylor series expansion to
show that the function defined by

$$g(z) = \begin{cases} \dfrac{f(z)-f(a)}{z-a}, & z \neq a, \\[1ex] f'(a), & z = a, \end{cases}$$

is also entire.

Solution: For $z \neq a$, we employ the Taylor series expansion of
$f(z)$ to obtain

$$g(z) = f'(a) + \frac{f''(a)}{2!}(z-a) + \frac{f'''(a)}{3!}(z-a)^2 + \cdots. \tag{7.68}$$

By the definition of $g$, the representation (7.68) is valid at $z = a$.
Hence, $g$ is equal to an everywhere-convergent power series and is
thus an entire function. ♣

2. If $f$ is entire and if for some integer $k \ge 0$ there exist positive constants
$A$ and $B$ such that

$$|f(z)| < A + B|z|^k,$$

then $f$ is a polynomial of degree $k$ at most. Prove it.

Solution: Note that the case $k = 0$ is the original Liouville
theorem. To prove the case of $k > 0$, we employ mathematical
induction, and consider

$$g(z) = \begin{cases} \dfrac{f(z)-f(0)}{z}, & z \neq 0, \\[1ex] f'(0), & z = 0, \end{cases} \tag{7.69}$$

where $f(z)$ is assumed to obey the conditions noted above. By
Exercise 1, $g$ is entire. In addition, by the hypothesis on $f$ we have

$$|g(z)| < C + D|z|^{k-1}$$

for suitable positive constants $C$ and $D$. Hence, by induction, $g$ is a polynomial of degree $k-1$ at most,
and then $f$ is a polynomial of degree $k$ at most owing to the definition
(7.69). This completes the proof. ♣

3. Find the Laurent series of the multivalued logarithmic function given by
$f(z) = \log(1+z) = \log|1+z| + i\arg(1+z)$.

Solution: The branch cut (see Sect. 8.2.3) is set so as to extend
from $-\infty$ to $-1$ along the real axis. Hence, $\log(1+z)$ is analytic
within the circle $|z| = 1$. Since

$$\frac{d}{dz}\log(1+z) = \frac{1}{1+z},$$

we may expand

$$\frac{1}{1+z} = 1 - z + z^2 - z^3 + \cdots = \sum_{n=0}^{\infty}(-1)^n z^n \qquad (|z| < 1).$$

Then, term-by-term integration yields

$$\log(1+z) = \int_0^z \frac{d\zeta}{1+\zeta} + C = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots + C,$$

where $C$ is the constant of integration. Since $\log 1 = 0$, it follows
that $C = 0$ and

$$\log(1+z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{z^n}{n} \qquad (|z| < 1).$$


Other branches of $\log(1+z)$ have the same series except for different
values of the constant $C$. ♣

4. Find the power series representation of $f(z)$ about $z = 0$ that satisfies the
differential equation

$$f'(z) + f(z) = 0 \quad \text{with } f(0) = 1. \tag{7.70}$$

Solution: Let $f(z) = 1 + \sum_{n=1}^{\infty} a_n z^n$. Then we have
$f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} = a_1 + \sum_{n=1}^{\infty}(n+1)a_{n+1}z^n$.
Substitute this into (7.70) to obtain
$1 + a_1 = 0$ and $a_n + (n+1)a_{n+1} = 0$ for $n \ge 1$.

The latter result yields

$$a_n = (-1)\frac{1}{n}\,a_{n-1} = (-1)^2\frac{1}{n(n-1)}\,a_{n-2} = \cdots = (-1)^{n-1}\frac{1}{n!}\,a_1.$$

Hence, we have $a_n = (-1)^n/n!$, so that

$$f(z) = 1 + \sum_{n=1}^{\infty}\frac{(-1)^n z^n}{n!} = e^{-z}. \quad ♣$$


5. Let $f(z) = \sum_{n=0}^{\infty} c_n (z-a)^n$ be analytic for $|z-a| < R$. Prove that

$$\frac{1}{2\pi}\int_0^{2\pi}\left|f(a + re^{i\theta})\right|^2 d\theta = \sum_{n=0}^{\infty}|c_n|^2 r^{2n}$$

for any $r < R$. Then show that

$$\sum_{n=0}^{\infty}|c_n|^2 r^{2n} \le M(r)^2, \tag{7.71}$$

in which $M(r) = \max_{|z-a|=r}|f(z)|$. The result (7.71) is called Gutzmer’s
theorem.

Solution: From the assumption, it follows that

$$|f(z)|^2 = \left[\sum_{n=0}^{\infty} c_n \left(re^{i\theta}\right)^n\right]\left[\sum_{m=0}^{\infty} c_m^* \left(re^{-i\theta}\right)^m\right] = \sum_{n,m=0}^{\infty} c_n c_m^*\, r^{m+n} e^{i(n-m)\theta}.$$

This infinite series converges uniformly on the circle $|z-a| = r < R$,
which allows us to interchange the order of integration and
summation as expressed by

$$\int_0^{2\pi}\left|f(a + re^{i\theta})\right|^2 d\theta = \sum_{n,m=0}^{\infty} c_n c_m^*\, r^{m+n}\int_0^{2\pi} e^{i(n-m)\theta}\,d\theta.$$

The right-hand side vanishes when $n \neq m$ since the integral equals
zero. Hence, we have

$$\int_0^{2\pi}\left|f(a + re^{i\theta})\right|^2 d\theta = \sum_{n=0}^{\infty}|c_n|^2 r^{2n} \times 2\pi,$$

which is equivalent to the desired equation. Furthermore, since
$|f(a + re^{i\theta})| \le M(r)$, we have

$$\sum_{n=0}^{\infty}|c_n|^2 r^{2n} = \frac{1}{2\pi}\int_0^{2\pi}\left|f(a + re^{i\theta})\right|^2 d\theta \le \frac{1}{2\pi}\int_0^{2\pi} M(r)^2\,d\theta = M(r)^2.$$
7.5 Applications in Physics and Engineering 


7.5.1 Fluid Dynamics 

This section demonstrates the effectiveness of using complex function theory 
for analyzing fluid dynamics in a two-dimensional plane. The primary aim is 
to derive the Kutta-Joukowski theorem (see Sect. 7.5.2), which describes
the lift force exerted on a solid material placed in a uniform flow. Before 
proceeding, we introduce terminologies and several basic concepts that pertain 
to fluid dynamics. 

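The fundamental quantities that characterize a two-dimensional fluid flow
are velocity $\mathbf{v} = u\,\mathbf{e}_x + v\,\mathbf{e}_y$ and vorticity $\boldsymbol{\omega} = \nabla\times\mathbf{v}$, both of which are
vector-valued functions of the position $\mathbf{r}$. Here, we restrict our attention to
the case of an irrotational ($\boldsymbol{\omega} = 0$) and incompressible ($\nabla\cdot\mathbf{v} = 0$) fluid.
The assumption $\boldsymbol{\omega} = \nabla\times\mathbf{v} = 0$ allows us to define an appropriate function $\Phi(x,y)$
such that

$$\mathbf{v} = \nabla\Phi, \tag{7.72}$$

since $\nabla\times(\nabla f) = 0$ for any analytic function $f(x,y)$ in the $x$-$y$ plane. The
function $\Phi(x,y)$ defined by (7.72) is called the velocity potential. Further,
our assumption of $\nabla\cdot\mathbf{v} = 0$ implies that

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0,$$

which in turn suggests the presence of an analytic function $\Psi(x,y)$ defined by

$$u = \frac{\partial\Psi}{\partial y}, \qquad v = -\frac{\partial\Psi}{\partial x} \tag{7.73}$$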


that satisfies the two-dimensional Laplace equation $\nabla^2\Psi = 0$. Such a function
$\Psi(x,y)$ is called a stream function.

Remark. The name stream function originates from the fact that the curves
of $\Psi(x,y) = \text{const.}$ in the $x$-$y$ plane represent streamline flow. This is shown
by noting that if $d\Psi = 0$, we have

$$d\Psi = \frac{\partial\Psi}{\partial x}\,dx + \frac{\partial\Psi}{\partial y}\,dy = -v\,dx + u\,dy = 0,$$

so that $dx/u = dy/v$, which implies that $d\mathbf{r}$ is parallel to $\mathbf{v}$.


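From (7.72) and (7.73), it follows that the components of the velocity $\mathbf{v}$ are
expressed as

$$u = \frac{\partial\Phi}{\partial x} = \frac{\partial\Psi}{\partial y}, \qquad v = \frac{\partial\Phi}{\partial y} = -\frac{\partial\Psi}{\partial x}.$$

These are the Cauchy-Riemann relations for $\Phi$ and $\Psi$, which allows us to introduce the concept of a complex velocity potential
$f(z)$ in the complex plane:

$$f(z) = \Phi(z) + i\Psi(z) \quad \text{with } z = x + iy. \tag{7.74}$$

Note that since $f(z)$ is analytic,

$$\frac{df}{dz} = \frac{\partial f}{\partial x} = u - iv = |\mathbf{v}|\,e^{-i\theta},$$

i.e., the absolute value of the derivative $|df/dz|$ gives the magnitude of the
velocity $|\mathbf{v}|$. Furthermore, the contour integral of the derivative of $f(z)$ has important physical
implications. Given a closed contour $C$ placed on a two-dimensional flow, we
have

$$\oint_C df = \Gamma(C) + iQ(C),$$

where

$$\Gamma(C) = \oint_C (u\,dx + v\,dy) = \oint_C \mathbf{v}\cdot d\mathbf{r}, \qquad Q(C) = \oint_C (u\,dy - v\,dx) = \oint_C |\mathbf{v}\times d\mathbf{r}|.$$

Hence, the integrals $\Gamma(C)$ and $Q(C)$ represent the circulation (or rotation)
and the fluid flow (the net outflow through $C$), respectively.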
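The decomposition into circulation and flux can be checked on a simple model flow. The sketch below is an illustration under assumed values: it takes the complex velocity $df/dz = U + k_0/z$ with $k_0 = (\Gamma + iQ)/(2\pi i)$ (a uniform stream plus a vortex and a source at the origin) and integrates it numerically around a circle. The helper `contour_integral` and all numerical values are hypothetical.

```python
import cmath
import math


def contour_integral(g, radius, samples=1024):
    """Closed integral of g(z) dz over the counterclockwise circle |z| = radius."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        total += g(z) * 1j * z          # dz = i z d(theta)
    return total * 2 * math.pi / samples


U, gamma, q = 1.0, 2.5, 0.7             # stream speed, circulation, source strength
k0 = (gamma + 1j * q) / (2j * math.pi)


def w(z):
    # complex velocity df/dz = u - i v
    return U + k0 / z


result = contour_integral(w, 3.0)
print(result.real, result.imag)          # circulation and flux through the contour
```

The real part of the integral recovers the circulation and the imaginary part the net outflow, independently of the radius chosen for the contour.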


7.5.2 Kutta-Joukowski Theorem

We are now ready to study the Kutta-Joukowski theorem, which describes
the lift force in a two-dimensional flow. The lift force is a component of the
fluid dynamic force that is perpendicular to the flow direction. It is the lift
force that makes it possible for airplanes, helicopters, sail boats, etc. to move
against the gravitational force or water currents.

Fig. 7.13. Spatial configuration of material placed into a two-dimensional uniform
flow with speed $U$ showing the components $F_x$, $F_y$ of the flow-induced force $F$ acting
on the material


♠ Kutta-Joukowski theorem:

The lift force $F_y$ that acts on a material placed in a uniform flow $U$ in
the $x$-direction is given by

$$F_y = -\rho U\,\Gamma(C), \tag{7.75}$$

where $\rho$ and $\Gamma(C)$ are the mass density and the circulation of the fluid,
respectively, within a closed contour $C$ surrounding the material (see
Fig. 7.13).


The lift force is generated in accordance with Bernoulli’s theorem and the
law of conservation of momentum. Both of these principles are used to
explain the mechanism responsible for the occurrence of the lift force in a
uniform flow, which is given by the Blasius formula (see Sect. 7.5.3):

$$F = \frac{i\rho}{2}\oint_C w^2\,dz, \qquad w = \frac{df}{dz},$$

which plays a key role in the proof of the Kutta-Joukowski theorem, as shown
below.


Proof (of the Kutta-Joukowski theorem). Assume a uniform flow oriented along
the $x$-axis. Then the function $w = df/dz$ is analytic outside the material and satisfies the relation

$$\lim_{z\to\infty} w = U = \text{const.}$$

Hence, $w$ can be expanded at points sufficiently far from the origin:

$$w = \frac{df}{dz} = U + \frac{k_0}{z} - \frac{c_1}{z^2} - \frac{2c_2}{z^3} - \cdots \qquad (z\to\infty), \tag{7.76}$$

which implies that

$$f = Uz + k_0\log z + c_0 + \frac{c_1}{z} + \frac{c_2}{z^2} + \cdots \qquad (z\to\infty) \tag{7.77}$$

and

$$w^2 = U^2 + \frac{2Uk_0}{z} + \frac{k_0^2 - 2Uc_1}{z^2} + \cdots \qquad (z\to\infty). \tag{7.78}$$

From (7.77) we have

$$\oint_C \frac{df}{dz}\,dz = 2\pi i k_0 = \Gamma(C) + iQ(C), \tag{7.79}$$

and substituting (7.78) into the Blasius formula $F = (i\rho/2)\oint_C w^2\,dz$, in which
$F = F_x - iF_y$ denotes the complex force, we obtain

$$F = F_x - iF_y = \frac{i\rho}{2}\cdot 2\pi i\cdot 2Uk_0 = -2\pi\rho U k_0. \tag{7.80}$$

Combining (7.79) and (7.80) yields

$$F_x - iF_y = \rho U(-Q + i\Gamma),$$

i.e.,

$$F_x = -\rho U Q, \qquad F_y = -\rho U\,\Gamma. \tag{7.81}$$

Of the two results above, it is the second one regarding $F_y$ that states the
theorem. ♣

Remark. The first equation in (7.81) indicates that F x = 0 if Q = 0; i.e. no 
force in the direction of the stream is relevant to a material inside the closed 
contour C if no source is located interior to C. This is precisely the case for 
an ideal flow without any viscosity. 
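The residue argument behind (7.80) can be verified numerically: only the $1/z$ term of $w^2$ contributes to the contour integral. The sketch below is an illustration with assumed values; it uses a truncated form of the far-field expansion (7.76), $w = U + k_0/z - c_1/z^2$, with a purely circulatory $k_0 = \Gamma/(2\pi i)$, and compares $(i\rho/2)\oint w^2\,dz$ with $-2\pi\rho U k_0$. The helper name and all parameter values are hypothetical.

```python
import cmath
import math


def contour_integral(g, radius, samples=1024):
    """Closed integral of g(z) dz over the counterclockwise circle |z| = radius."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        total += g(z) * 1j * z          # dz = i z d(theta)
    return total * 2 * math.pi / samples


rho, U, gamma = 1.2, 3.0, 2.0            # density, stream speed, circulation
k0 = gamma / (2j * math.pi)              # no source inside the contour (Q = 0)
c1 = 0.4 - 0.3j                          # arbitrary higher-order coefficient


def w(z):
    return U + k0 / z - c1 / z**2


blasius = 0.5j * rho * contour_integral(lambda z: w(z) ** 2, 5.0)
expected = -2 * math.pi * rho * U * k0   # right-hand side of (7.80)
print(blasius, expected)
```

The higher-order coefficient `c1` has no influence on the result, which reflects the fact that the force depends only on $U$ and $k_0$.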


7.5.3 Blasius Formula 

We conclude this section by explaining the Blasius formula, which is impor- 
tant for the proof of the Kutta-Joukowski theorem discussed above. Consider 
a two-dimensional flow of irrotational and incompressible fluid and assume 
that a solid material is placed inside a closed contour C encircling a portion 
of the fluid. Apparently, a force F from the flow is exerted on the material. 
Hence, the law of the conservation of momentum within the contour $C$
is written as

$$F + \oint_C dG = 0,$$

where $dG$ represents the sum of momenta that pass through a line element
$ds$ of the closed contour $C$ per unit time. It is given by

$$dG = p\,\mathbf{n}\,ds + \rho\,\mathbf{v}\,v_n\,ds, \tag{7.82}$$

where $p$ is the fluid pressure, $\mathbf{n}$ is a unit vector normal to the contour $C$, $\rho$
is the density of the fluid, and $v_n = \mathbf{v}\cdot\mathbf{n}$. The first and second terms on the
right-hand side of (7.82) represent the impulse transmitted to the interior of
$C$ through $ds$ and the momentum carried by the fluid passing through $ds$, respectively. Using
the stream function $\Psi$, we rewrite (7.82) as

$$dG = p\,\mathbf{n}\,ds + \rho\,\mathbf{v}\,d\Psi, \tag{7.83}$$

since $d\Psi = v_n\,ds$.

In order to obtain the complex-number representation of (7.83), we denote
by $dz$ an infinitesimal vector having length $ds$ and a direction normal to $\mathbf{n}$.
We then have

$$dz = i(n_x + i n_y)\,ds.$$

When we apply this relation to (7.82), the quantity $dG$ is expressed as

$$dG_x + i\,dG_y = -ip\,dz + \rho\,\frac{df^*}{dz^*}\cdot\frac{df - df^*}{2i}, \tag{7.84}$$

where we consider $d\Psi$ to be the imaginary part of $df$. The pressure $p$ is known
to correlate with $f$ via Bernoulli’s theorem, which is expressed by

$$p = p_0 - \frac{\rho}{2}\left|\frac{df}{dz}\right|^2 = p_0 - \frac{\rho}{2}\,\frac{df}{dz}\,\frac{df^*}{dz^*}, \tag{7.85}$$

where $p_0$ is the pressure at a position far from the material (i.e., $z \to \infty$). It
then follows from (7.84) and (7.85) that


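$$dG_x + i\,dG_y = -ip_0\,dz + \frac{i\rho}{2}\,\frac{df}{dz}\,\frac{df^*}{dz^*}\,dz - \frac{i\rho}{2}\,\frac{df^*}{dz^*}\,(df - df^*) \tag{7.86}$$

$$= -ip_0\,dz + \frac{i\rho}{2}\left(\frac{df^*}{dz^*}\right)^2 dz^*, \tag{7.87}$$

where we have used $(df/dz)\,dz = df$ and $df^* = (df^*/dz^*)\,dz^*$.
Since $\oint_C dz = 0$, the momentum balance $F_x + iF_y = -\oint_C (dG_x + i\,dG_y)$ gives,
after taking the complex conjugate,

$$F_x - iF_y = \frac{i\rho}{2}\oint_C \left(\frac{df}{dz}\right)^2 dz, \tag{7.88}$$

which is known as the Blasius formula.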



8 


Singularity and Continuation 


Abstract We devote the first half of this chapter to the essential properties and 
classification of singularities, which are nonanalytic points in a complex plane. We 
then describe analytic continuation, which is a most important concept from a the- 
oretical as well as an applied point of view. Through analytic continuations, we 
observe the interesting fact that the functional form of a complex function may 
undergo various changes depending on the defining region in the complex plane. 


8.1 Singularity 

8.1.1 Isolated Singularities 

A singularity of a complex function f(z) is any point where it is not analytic. 
In particular, the point z = a is called an isolated singularity if and only 
if f(z) is analytic in some neighborhood but not at z = a. Most singularities 
we have encountered so far in this text were isolated singularities. However, 
we will see later that there are singularities that are not isolated. 

When z = a is an isolated singularity of f(z), it is classified as follows: 

1. A removable singularity if and only if $f(z)$ is finite throughout a neighborhood
of $z = a$, except possibly at $z = a$ itself.

2. A pole of order $m$ ($m = 1, 2, \cdots$) if and only if $(z-a)^m f(z)$ but not
$(z-a)^{m-1} f(z)$ is analytic at $z = a$. In this case, $\lim_{z\to a}|f(z)| = \infty$ no
matter how $z$ approaches $z = a$.

3. An essential singularity if and only if the Laurent series of $f(z)$ around
$z = a$ has an infinite number of terms involving negative powers of $(z-a)$.

Remark. There is an alternative definition of a pole: the point $z = a$ is a pole
of $m$th order of $f(z)$ if and only if $1/f(z)$ is analytic and has a zero of order
$m$ at $z = a$.

The three types of isolated singularities described above can be distinguished
by the degree of expansion of the Laurent series of $f(z)$ being considered. Let
$f(z)$ have an isolated singularity at $z = a$. Then there is a real number $\delta > 0$
such that $f(z)$ is analytic for $0 < |z-a| < \delta$ but not for $z = a$, which means
that $f(z)$ can be represented by the Laurent series

$$f(z) = \sum_{n=0}^{\infty} c_n (z-a)^n + \sum_{n=1}^{M}\frac{c_{-n}}{(z-a)^n}. \tag{8.1}$$

Thus, it suffices to examine the expansion degree $M$ of the principal part, the
second sum in (8.1), in order to determine the type of the isolated singularity
$z = a$.

Case 1. Removable singularities ($M = 0$)

In this case, the principal part is absent so that the Laurent series around
$z = a$ reads

$$f(z) = c_0 + c_1(z-a) + c_2(z-a)^2 + \cdots \qquad (z \neq a).$$

Observe that $\lim_{z\to a} f(z) = c_0$, as is consistent with statement 1 above, which
says that $f(z)$ is finite in a neighborhood of $z = a$. This kind of singularity
can be eliminated by redefining $f(a)$ as $c_0$, which is why we call it removable.

Examples Consider the function 

f{z) = — • ( 8 . 2 ) 

z 

This yields $\lim_{z\to 0} f(z) = 1$, but the value of $f(0)$ is not defined. Hence, $z = 0$
is a removable singularity of (8.2). In a similar sense, the functions

e sinz/z and — 

2 tan 2 

are regarded as analytic at $z = 0$, since this point is a removable singularity
for each.

Case 2. Isolated poles ($M$ is finite)

The second type of isolated singularity, for which the principal part reads

$$\sum_{n=1}^{M} c_{-n} h^{-n} \qquad (c_{-M} \neq 0,\ M \ge 1),$$

with $h = z - a$, is called a pole of order $M$. The order $M$ is the smallest integer that
makes the quantity

$$\lim_{z\to a}(z-a)^M f(z)$$

a finite, nonzero complex number.

Examples 1. The function $f(z) = 1/\sin z$ has a Laurent series valid for
$0 < |z| < \pi$,

$$\frac{1}{\sin z} = \frac{1}{z} + \frac{z}{6} + \frac{7}{360}z^3 + \frac{31}{15120}z^5 + \cdots,$$

from which it follows that it has a simple pole at the origin.

2. The function $f(z) = 1/z$ has a simple pole at $z = 0$, which is easily seen
by noting that $\lim_{z\to 0} z f(z) = 1$.
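The limit criterion for the order of a pole lends itself to a quick numerical check. The sketch below (an illustration with assumed sample points, not part of the original text) compares $1/\sin z$ with the truncated Laurent series quoted in Example 1 and evaluates $z f(z)$ close to the origin.

```python
import math


def f(z):
    return 1 / math.sin(z)


def csc_series(z):
    # First four terms of the Laurent series of 1/sin z about z = 0
    return 1 / z + z / 6 + 7 * z**3 / 360 + 31 * z**5 / 15120


z = 0.1
print(f(z), csc_series(z))       # agree up to the omitted O(z^7) term

small = 1e-6
print(small * f(small))          # close to 1, so the pole is simple (M = 1)
print(abs(f(small)))             # f itself is unbounded near z = 0
```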


Case 3. Essential singularities ($M = \infty$)


The third type of isolated singularity, the essential singularity, gives rise to an
infinite principal part.

Examples The function $f(z) = e^{1/z}$ has the Laurent series

$$e^{1/z} = 1 + \frac{1}{z} + \frac{1}{2!\,z^2} + \frac{1}{3!\,z^3} + \cdots,$$

which is valid for $|z| > 0$. Since the principal part is infinite, the function has
an essential singularity at $z = 0$.


Remark. An infinite principal part in the Laurent series implies an essential
singularity only when the series is valid for all points in a neighborhood
$|z-a| < \varepsilon$ except $z = a$. For example, the series

$$f(z) = \frac{1}{(z-1)^2} + \frac{1}{(z-1)^3} + \frac{1}{(z-1)^4} + \cdots$$

does not mean that $z = 1$ is an essential singularity of $f(z)$, since the series
converges only if $|z-1| > 1$. It actually represents the function $f(z) = 1/(z^2 - 3z + 2)$
in the region $1 < |z-1| < \infty$, which evidently has a simple pole at
$z = 1$.


8.1.2 Nonisolated Singularities 

As noted earlier, there are other kinds of singular points that are neither
poles nor essential singularities. For example, neither $\sqrt{z}$ nor $\log z$ can be
expanded near $z = 0$ in Laurent series; both of them are discontinuous along
an entire line (say, the negative real axis) so that the singular point $z = 0$ is
not isolated. Singularities of this kind, called branch points, are discussed
in the next subsection.

Another type of singular behavior of an analytic function occurs when it
possesses an infinite number of isolated singularities converging to some limit
point. Consider, for instance,

$$f(z) = \frac{1}{\sin(1/z)}.$$

The denominator has simple zeros whenever

$$z = \frac{1}{n\pi} \qquad (n = \pm 1, \pm 2, \cdots).$$

The function $f(z)$ has simple poles at these points and the sequence of these
poles converges toward the origin. The origin cannot be regarded as an isolated
singularity because every one of its neighborhoods contains at least one pole
(actually an infinite number of poles).


8.1.3 Weierstrass Theorem for Essential Singularities 

The behavior of a function in the neighborhood of an isolated essential singularity
is different from the cases of other isolated singularities such as poles
and removable singularities. Most remarkable is the fact that $f(z)$ can be
made to take any arbitrary complex value by choosing an appropriate path
of $z \to a$. For instance, if $z$ approaches zero along the negative real semiaxis,
then the function $f(z) = e^{1/z}$ yields $|f(z)| \to 0$. However, if $z$ approaches zero
along the positive real semiaxis, then $|f(z)| \to \infty$. Finally, if $z$ approaches
zero along the imaginary axis, then $|f(z)|$ remains constant but $\arg f(z)$ oscillates,
and so on. The character of a function near an essential singularity is
described by the following theorem:

Weierstrass theorem: 

In any neighborhood of an isolated essential singularity, an analytic 
function approaches any given value arbitrarily closely. 


Proof We prove the theorem by contradiction. Let $z = a$
be an isolated essential singularity of $f(z)$. We assume for the moment that
for $|z-a| < \varepsilon$, $|f(z) - \gamma|$ with a given complex number $\gamma$ does not become
arbitrarily small. Then, the function $[f(z) - \gamma]^{-1}$ is bounded in the region
$|z-a| < \varepsilon$ so that it is possible to find a constant $M$ such that

$$\left|\frac{1}{f(z)-\gamma}\right| < M \quad \text{for } |z-a| < \varepsilon.$$

Hence, $[f(z) - \gamma]^{-1}$ is analytic for $|z-a| < \varepsilon$ (or at worst has a removable
singularity) and can be expanded as

$$\frac{1}{f(z)-\gamma} = b_0 + b_1(z-a) + b_2(z-a)^2 + \cdots. \tag{8.3}$$

If $b_0 \neq 0$, then

$$\lim_{z\to a}\frac{1}{f(z)-\gamma} = b_0 \quad \text{so that} \quad \lim_{z\to a} f(z) = \gamma + \frac{1}{b_0}.$$

This means that $z = a$ is at worst a removable singularity of $f(z)$, which contradicts our
assumption. Otherwise, if $b_0 = 0$, we have

$$f(z) = \gamma + \frac{1}{(z-a)^k\left[b_k + b_{k+1}(z-a) + \cdots\right]},$$

where $b_k$ is the first nonzero coefficient in the series (8.3). This clearly shows
that $z = a$ is a pole of $f(z)$ of order $k$, which again is inconsistent with
our assumption. Therefore, we conclude that $|f(z) - \gamma|$ with a given $\gamma$ can be
arbitrarily small in the vicinity of an essential singularity $z = a$. Furthermore,
since $\gamma$ is arbitrary, the function $f(z)$ approaches any given complex value
arbitrarily closely. ♣
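For the example $f(z) = e^{1/z}$ the statement of the theorem can be made concrete. The sketch below is an illustration with an arbitrarily chosen target value: the points $z_k = 1/(\log\gamma + 2\pi i k)$ tend to the essential singularity $z = 0$ as $k$ grows, and $f$ takes the value $\gamma$ at every one of them (which is even stronger than approaching $\gamma$). The names `target` and `preimage` are hypothetical.

```python
import cmath
import math

target = 2 + 3j  # any nonzero complex value could be used here


def preimage(k):
    """A point z_k with exp(1/z_k) = target; z_k tends to 0 as k grows."""
    return 1 / (cmath.log(target) + 2j * math.pi * k)


points = [preimage(k) for k in (1, 10, 100, 1000)]
for z in points:
    print(abs(z), cmath.exp(1 / z))
```

Every neighborhood of the origin, however small, therefore contains points at which $e^{1/z}$ equals the chosen value.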

Remark. The above theorem becomes invalid if the point at infinity is taken
into account; the point at infinity $z = \infty$ is defined as the point $z$ that is
mapped onto the origin $w = 0$ by the transformation $w = 1/z$. For instance, the
function $f(z) = e^z$ has an essential singularity at $z = \infty$ but never approaches
zero there.


8.1.4 Rational Functions 


In comparison with the previous case, the behavior of an analytic function
near a pole is easy to describe. We now derive the following result:

Theorem: 

A rational function has no singularities other than poles. Conversely, 
an analytic function that has no singularities other than poles is neces- 
sarily a rational function. 


A rational function $f(z)$ is of the form

$$f(z) = \frac{p(z)}{q(z)}, \tag{8.4}$$

where

$$p(z) = \alpha_0 + \alpha_1 z + \alpha_2 z^2 + \cdots + \alpha_n z^n$$

and

$$q(z) = \beta_0 + \beta_1 z + \beta_2 z^2 + \cdots + \beta_m z^m.$$

Observe that the polynomials p(z) and <7(2) are analytic at all finite points on 
the complex plane. 

Proof In what follows, we assume that $p(z)$ and $q(z)$ have no common zeros; 
if they do have a common zero at $z = z_0$, it is always possible to write $f(z)$ in 
(8.4) as the quotient of two polynomials with no common zeros by canceling 
a suitable number of the $(z - z_0)$-factors. 

Obviously, the only possible singularities of $f(z)$ are situated at the zeros 
of $q(z)$. Since the zeros of $p(z)$ do not coincide with those of $q(z)$, $f(z)$ 
necessarily diverges at the zeros of $q(z)$. Such points can be poles but not essential 
singularities in view of the Weierstrass theorem given in Sect. 8.1.3. We have 
thus proved that all singularities of rational functions $f(z)$ are necessarily 
poles. 

To prove the converse, suppose that all the singularities of an analytic 
function $f(z)$ are poles at the points $a_1, a_2, \cdots, a_n$. The orders of these poles 
are denoted by $m_1, m_2, \cdots, m_n$, respectively. In the vicinity of the point $a_\nu$, 
the function $f(z)$ has a Laurent series expansion of the form 

\[ f(z) = \frac{c^{(\nu)}_{-m_\nu}}{(z - a_\nu)^{m_\nu}} + \cdots + \frac{c^{(\nu)}_{-1}}{(z - a_\nu)} + \sum_{\mu=0}^{\infty} c^{(\nu)}_{\mu} (z - a_\nu)^{\mu}, \]

where the superscripts $(\nu)$ on $c$ indicate that they are the coefficients that 
belong to the $\nu$th pole, $z = a_\nu$. Denote the principal part by 

\[ g_\nu(z) = \frac{c^{(\nu)}_{-m_\nu}}{(z - a_\nu)^{m_\nu}} + \cdots + \frac{c^{(\nu)}_{-1}}{(z - a_\nu)} \tag{8.5} \]

and consider the expression 

\[ h(z) = f(z) - g_1(z) - g_2(z) - \cdots - g_n(z). \]

Since $f(z) - g_\nu(z)$ is analytic at $z = a_\nu$, and $g_\nu(z)$ is analytic everywhere 
except at $z = a_\nu$, it follows that $h(z)$ is analytic at all points of the complex 
plane, including the point at infinity. In view of Liouville's theorem such a 
function is necessarily a constant. Thus we have identically $h(z) = \gamma_0$, whence 

\[ f(z) = \gamma_0 + \sum_{\nu=1}^{n} g_\nu(z), \tag{8.6} \]

which implies that $f(z)$ can be brought into the form (8.4). This completes 
the proof of our theorem. ∎
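
As a numerical illustration of the decomposition of a rational function into a constant plus the principal parts at its poles, the following minimal Python sketch may be helpful. The test function $(z^2+1)/((z-1)(z+2))$ and the helper name `residue_at_simple_pole` are our own illustrative choices, not taken from the text; only simple poles are treated.

```python
import cmath

def f(z):
    """Hypothetical rational test function with simple poles at z = 1 and z = -2."""
    return (z * z + 1) / ((z - 1) * (z + 2))

def residue_at_simple_pole(func, a, eps=1e-6, n=8):
    """Approximate c_{-1} = lim (z - a) func(z) by averaging over a tiny circle.

    Averaging over n equally spaced points removes the first n - 1 positive
    powers of (z - a), so the estimate is accurate for small eps.
    """
    total = 0j
    for k in range(n):
        z = a + eps * cmath.exp(2j * cmath.pi * k / n)
        total += (z - a) * func(z)
    return total / n

c1 = residue_at_simple_pole(f, 1)    # coefficient of 1/(z - 1)
c2 = residue_at_simple_pole(f, -2)   # coefficient of 1/(z + 2)

def h(z):
    """f minus its principal parts; by the argument above this is a constant."""
    return f(z) - c1 / (z - 1) - c2 / (z + 2)

samples = [0.3 + 0.4j, -5 + 2j, 10j, 100 - 50j]
values = [h(z) for z in samples]
print(c1, c2, values)
```

Under these assumptions, `h` takes the same value at every sample point, which plays the role of the constant in the decomposition.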


Exercises 


1. Find the poles and their orders for the following functions: 

(a) $f(z) = \dfrac{\sin(z + 1)}{z^3}$,   (b) $f(z) = \dfrac{\sin z}{z^3}$. 

Solution: (a) Clearly, $\lim_{z \to 0} z^2 f(z) = \infty$ and $\lim_{z \to 0} z^3 f(z) = 
\sin(1) \neq 0$. Hence, $f$ has a third-order pole at $z = 0$ arising from 
the factor $1/z^3$. (b) Since $\lim_{z \to 0} z^3 f(z) = 0$, the pole of $f(z)$ is 
not a third-order pole. Instead, noting the asymptotic behavior of 
$\sin z$ near $z = 0$, we obtain 

\[ \lim_{z \to 0} z^2 f(z) = \lim_{z \to 0} z^2\, \frac{z - (z^3/3!) + \cdots}{z^3} = 1. \]

Hence, $f(z)$ has a second-order pole at $z = 0$. ∎

2. Show that a function $f(z)$ cannot be bounded in the neighborhood of 
its isolated singular point $z = a$. 

Solution: Use the contraposition method; if $|f(z)| < M$ for 
$0 < |z - a| \le r$, then the Laurent coefficients of negative index satisfy 

\[ |c_{-n}| = \left| \frac{1}{2\pi i} \oint_C (z - a)^{n-1} f(z)\, dz \right| < M r^n \quad \text{for any } n \ge 1, \]

where $C$ is the circle given by $|z - a| = r$. Since $r$ may be taken 
as small as desired, we have 

\[ c_{-1} = c_{-2} = \cdots = 0, \]

which means that the Laurent series reduces to a Taylor series. 
Hence, $f(z)$ should be analytic at $z = a$, which contradicts the 
assumption that $z = a$ is a singular point. ∎


3. Let both $f(z)$ and $g(z)$ be analytic in the vicinity of $z = a$ and have a 
zero of $m$th order at $z = a$. Prove that 

\[ \lim_{z \to a} \frac{f(z)}{g(z)} = \frac{f^{(m)}(a)}{g^{(m)}(a)}. \tag{8.7} \]

This result is called l'Hôpital's rule. 

Solution: In the vicinity of $z = a$, we have 

\[ f(z) = (z - a)^m \frac{f^{(m)}(a)}{m!} + (z - a)^{m+1} \frac{f^{(m+1)}(a)}{(m+1)!} + (z - a)^{m+2} \frac{f^{(m+2)}(a)}{(m+2)!} + \cdots, \]

and a similar expansion holds for $g(z)$. These expressions 
immediately yield the desired equation (8.7). ∎

4. Prove that if $f(z)$ has an essential singularity at $z = a$, $1/f(z)$ also 
has an essential singularity there. 

Solution: Suppose that $f$ has an essential singularity at $z = a$ 
but that $1/f$ does not. If this is true, $1/f$ will at most have a pole 
there (of order $N$, for instance) and is expressed in terms of the 
series as 

\[ \frac{1}{f} = \sum_{n=-N}^{\infty} b_n h^n. \]

Rewrite this to obtain 

\[ f = \frac{1}{\sum_{n=-N}^{\infty} b_n h^n} = \frac{h^N}{\sum_{m=0}^{\infty} b_{m-N} h^m}. \]

Note that the denominator $\sum b_{m-N} h^m$ is analytic within $C_1$ and does not 
vanish at $h = 0$ because $b_{-N} \neq 0$, and 
thus the fraction $1/\sum b_{m-N} h^m$ is analytic near $h = 0$ as well. As a result, the function 
$f$ would be expanded into a power series in $h$ starting with $h^N$; 
this result contradicts our assumption that $f(z)$ has an essential 
singularity at $z = a$. Therefore, wherever $f(z)$ has an essential 
singularity, $1/f$ also necessarily has one. ∎

Remark. The above result sounds intriguing when compared with the behavior 
of an $f(z)$ that has a pole. If $f(z)$ has a pole of order $N$ at $z = a$, $1/f$ obviously 
has no pole but does have a zero of order $N$; i.e., $1/f \propto (z - a)^N$. 
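
The pole orders found in Exercise 1 can be checked numerically. The sketch below is only a rough illustration under our own assumptions: the helper `estimate_pole_order` (a hypothetical name) fits the growth law $|f(z)| \sim C/|z - a|^m$ from two small radii along one arbitrary direction of approach.

```python
import cmath
import math

def estimate_pole_order(func, a, r1=1e-2, r2=1e-3):
    """Estimate m from |func(z)| ~ C / |z - a|**m using two small radii."""
    direction = cmath.exp(0.7j)  # arbitrary direction of approach
    v1 = abs(func(a + r1 * direction))
    v2 = abs(func(a + r2 * direction))
    return math.log(v2 / v1) / math.log(r1 / r2)

def fa(z):
    return cmath.sin(z + 1) / z**3   # function (a) of Exercise 1

def fb(z):
    return cmath.sin(z) / z**3       # function (b) of Exercise 1

order_a = estimate_pole_order(fa, 0)
order_b = estimate_pole_order(fb, 0)
print(order_a, order_b)
```

The estimates are not exact integers because the numerator is not constant near the pole, so they should be rounded to the nearest integer.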

8.2 Multivaluedness 

8.2.1 Multivalued Functions 

Up to this point, our concern has been limited to single-valued functions, i.e., 
functions whose values are uniquely specified once $z$ is given. When we 
consider multivalued functions, many important theorems must be reformulated. 

The necessary concepts are best illustrated by considering the behavior of 
the function $f(z) = z^{1/2}$ in a graphical manner. Figure 8.1 gives a contour of a 
unit circle $a \to b$ on the $z$-plane. Through the transformation $w = f(z) = z^{1/2}$, 
the circle is mapped onto a semicircle $A \to B$ on the $w$-plane such that 

\[ z = e^{i0} = 1 \;\to\; w = 1, \qquad z = e^{2\pi i} = 1 \;\to\; w = e^{i\pi} = -1. \]

Fig. 8.1. Mapping of a circle on the $z$-plane onto an upper-half circle on the $w$-plane 
through $f(z) = z^{1/2}$ 

Of importance is the fact that the images of the points $a$ and $b$, i.e., $A$ and $B$, 
respectively, are not equal but are distinct on the $w$-plane. This suggests that 
the value of $z^{1/2}$ for $z = 1$ is not uniquely determined. Furthermore, a similar 
phenomenon occurs for any circular contour $a \to b$ with an arbitrarily large 
(or small) radius. We thus see that the function $f(z) = z^{1/2}$ is multivalued, 
at least along the positive real axis; one point on the positive real axis of the 
$z$-plane is associated with two distinct points on the $w$-plane. 

As a matter of fact, the multivaluedness of the function $f(z) = z^{1/2}$ noted 
above occurs at all points on the whole $z$-plane (except at the origin). To see 
this, we observe again that the circular contour $a \to b$ may have any radius. 
As a result, all the points on the $z$-plane are correlated with only half of the 
points on the $w$-plane, those for which $\mathrm{Im}\, w = v > 0$. The remaining values 
of $w$ are generated if a second circuit $a \to b$ is made. Namely, the values of 
$w$ with $v < 0$ will be correlated with those values of $z$ whose arguments lie 
between $2\pi$ and $4\pi$. As a consequence, all values for $z^{1/2}$ represented by points on 
the $w$-plane may be divided into two independent sets: the set of values of $w$ 
generated on the first circuit of the $z$-plane, $0 \le \phi < 2\pi$, and those generated 
on the second circuit, $2\pi \le \phi < 4\pi$. These two independent sets of values for 
$z^{1/2}$ are called the branches of $z^{1/2}$. 

The concept of branch allows us to apply the theory of analytic functions 
to many-valued functions, where each branch is defined as a single-valued 
continuous function throughout its region of definition. 
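
The passage from one branch of $z^{1/2}$ to the other can be reproduced numerically by following the value of the square root continuously along a circle around the origin. The following is a minimal sketch under our own assumptions (the function name, the step count, and the nearest-value rule for enforcing continuity are illustrative choices):

```python
import cmath
import math

def continue_sqrt_along_circle(turns, steps_per_turn=720, radius=1.0):
    """Follow w = z**(1/2) continuously along |z| = radius.

    The walk starts from w = +sqrt(radius) at phi = 0 and makes `turns`
    counterclockwise circuits around the origin.
    """
    w = complex(math.sqrt(radius), 0.0)
    for k in range(1, turns * steps_per_turn + 1):
        phi = 2 * math.pi * k / steps_per_turn
        z = radius * cmath.exp(1j * phi)
        root = cmath.sqrt(z)  # principal value; possibly the "wrong" branch
        # Continuity: keep whichever of +root and -root is closer to the last value.
        w = root if abs(root - w) <= abs(-root - w) else -root
    return w

after_one = continue_sqrt_along_circle(1)
after_two = continue_sqrt_along_circle(2)
print(after_one, after_two)
```

After one circuit the tracked value has changed sign, and only after the second circuit does it return to its starting value, in line with the two branches described above.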

8.2.2 Riemann Surfaces 

For the case $z^{1/2}$, the notion that the regions $0 \le \phi < 2\pi$ and $2\pi \le \phi < 4\pi$ 
correspond to two different regions of the $w$-plane is awkward geometrically, 
since each of these two regions covers the $z$-plane completely. To re-establish 
the single-valuedness and continuity of $f(z)$, it is desirable to give separate 
geometric meanings to the two $z$-plane regions. This is achieved through the use 
of the notion of Riemann surfaces. 

A Riemann surface is an ingenious device for representing both branches 
by means of a single continuous mapping. Suppose that two separate $z$-planes 
are cut along the positive real semiaxis from $+\infty$ to $0$ (see Fig. 8.2), and that 
the planes are superimposed on each other but retain their separate identities. 

Fig. 8.2. A Riemann surface composed of two separated $z$-planes 


Now suppose that the first quadrant of the upper sheet is joined along the 
cut to the fourth quadrant of the lower sheet, and the fourth quadrant of the 
upper sheet to the first quadrant of the lower sheet, to form a continuous surface. It 
is now possible to start a curve $C$ in the first quadrant of the upper sheet, go 
around the origin, and cross the positive real semiaxis into the first quadrant 
of the lower sheet in a continuous motion. The curve can be continued on the 
lower sheet around the origin into the first quadrant of the upper sheet. This 
process of cutting and cross-joining two planes leads to the formation of a 
Riemann surface, which is thought of as a single continuous surface formed of 
two Riemann sheets. 

Several important remarks are in order. 

1. According to this model, the positive real semiaxis appears as a line where 
all four edges of our cuts meet. However, the Riemann surface has no 
such property. The line between the first quadrant of the 
upper sheet and the fourth quadrant of the lower sheet is considered 
distinct from the line between the first quadrant of the lower sheet and the 
fourth quadrant of the upper one. There are two real positive semiaxes on 
the Riemann surface just as there are two real negative semiaxes. Hence, 
the entire Riemann surface is mapped one-to-one onto the $w$-plane. (The 
origin $z = 0$ belongs to neither branch since the polar angle $\theta$ is not defined 
for $z = 0$.) 


2. The splitting of a multivalued function into branches is arbitrary to a 
great extent. For instance, we can define the following two functions, both 
of which may be treated as branches of $f(z) = \sqrt{z}$: 

\[ \text{Branch A}: \quad f_A(z) = \begin{cases} \sqrt{r}\, e^{i\theta/2} & \text{for } 0 \le \theta \le \pi, \\ \sqrt{r}\, e^{i(\theta + 2\pi)/2} & \text{for } -\pi < \theta < 0, \end{cases} \]

\[ \text{Branch B}: \quad f_B(z) = \begin{cases} \sqrt{r}\, e^{i(\theta + 2\pi)/2} & \text{for } 0 \le \theta \le \pi, \\ \sqrt{r}\, e^{i\theta/2} & \text{for } -\pi < \theta < 0. \end{cases} \]

Note that branch A is continuous on the negative real semiaxis but is 
discontinuous on the positive real semiaxis (so is branch B). These two 
branches together constitute the double-valued function $f(z) = \sqrt{z}$, and 
this representation is no better and no worse than the previous one. 

3. The above-mentioned technique can be extended to other multivalued 
functions that require more than two Riemann sheets (for instance, $f(z) = 
\sqrt[3]{z}$ requires three). There are functions requiring an infinite number of 
Riemann sheets, such as $f(z) = z^{\alpha}$ with an irrational $\alpha$. 
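
The continuity properties claimed for branch A in Remark 2 can be checked directly. The sketch below assumes the piecewise definitions of the two branches in terms of the polar angle $-\pi < \theta \le \pi$ as we read them from the text; the function names are our own.

```python
import cmath
import math

def branch_A(z):
    """sqrt(r) e^{i theta/2} for 0 <= theta <= pi, sqrt(r) e^{i(theta+2pi)/2} otherwise."""
    r, theta = cmath.polar(z)  # theta lies in [-pi, pi]
    shift = 0.0 if theta >= 0 else 2 * math.pi
    return math.sqrt(r) * cmath.exp(1j * (theta + shift) / 2)

def branch_B(z):
    """The complementary choice: the 2*pi shift is applied on the upper half-plane."""
    r, theta = cmath.polar(z)
    shift = 2 * math.pi if theta >= 0 else 0.0
    return math.sqrt(r) * cmath.exp(1j * (theta + shift) / 2)

eps = 1e-9
# Jump of branch A across the negative and the positive real semiaxis.
jump_negative_axis = abs(branch_A(-1 + eps * 1j) - branch_A(-1 - eps * 1j))
jump_positive_axis = abs(branch_A(1 + eps * 1j) - branch_A(1 - eps * 1j))

z_test = -0.4 + 1.7j
square_error = abs(branch_A(z_test) ** 2 - z_test)
sum_of_branches = abs(branch_A(z_test) + branch_B(z_test))
print(jump_negative_axis, jump_positive_axis, square_error, sum_of_branches)
```

With these definitions the two branches differ only by a sign, so each of them squares to $z$.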


8.2.3 Branch Point and Branch Cut 

We go back to the behavior of the multivalued function $w = f(z) = z^{1/2}$ to 
introduce other important concepts referred to as branch point and branch 
cut. Let us consider a certain closed curve $C$ without self-intersections in the 
$z$-plane. Specify a point $z_0$ to which we assign a definite value of the argument 
$\theta_0$. Through the mapping $w = z^{1/2}$, we will find two distinct points: $w_0(z_0)$ 
and $w_1(z_0)$. 

In what follows, we examine the variation of the functions $w_0(z)$ and $w_1(z)$ 
as the point $z$ moves continuously along the curve $C$. Since the argument of 
the point $z$ on the curve $C$ varies continuously, the functions $w_0(z)$ and $w_1(z)$ 
are continuous functions of $z$ on the curve $C$. 

Here, two different cases are possible. In the first case, the curve $C$ does 
not contain the point $z = 0$ within it. Then, after traveling the curve $C$, the 
argument of the point $z_0$ returns to the original value $\arg z_0 = \theta_0$. Hence, the 
values of the functions $w_0(z)$ and $w_1(z)$ are also equal to their original values 
at the point $z = z_0$ after traveling the curve $C$. Thus, in this case, two distinct 
single-valued functions of the complex variable $z$ are defined on $C$: 

\[ w_0 = r^{1/2} e^{i\theta/2} \quad \text{and} \quad w_1 = r^{1/2} e^{i(\theta + 2\pi)/2}. \]

Obviously, if the domain $D$ of the $z$-plane has the property that no closed 
curve in the domain contains the point $z = 0$ within it, then two distinct 
single-valued continuous functions, $w_0(z)$ and $w_1(z)$, are defined in $D$. We call the 
functions $w_0(z)$ and $w_1(z)$ branches of the multivalued function $w(z) = z^{1/2}$. 

In the second case, the curve $C$ contains the point $z = 0$ within it. Then, 
after traversing $C$ in the positive direction, the value of the argument of the 
point $z_0$ does not return to the original value $\theta_0$ but changes by $2\pi$ as expressed 
by 

\[ \arg z_0 = \theta_0 + 2\pi. \]

Therefore, as a result of their continuous variation after traversing the curve 
$C$, the values of the functions $w_0(z)$ and $w_1(z)$ at the point $z_0$ are no longer 
equal to the original values. More precisely, denoting the values after the circuit by a tilde, we obtain 

\[ \tilde{w}_0(z_0) = w_0(z_0) e^{i\pi} \quad \text{and} \quad \tilde{w}_1(z_0) = w_1(z_0) e^{i\pi}, \]

which indicate that the function $w_0(z)$ goes into the function $w_1(z)$ and vice 
versa. This recurrence phenomenon stems from the fact that $z = 0$ is the 
branch point of the multivalued function $f(z) = z^{1/2}$. A formal definition of 
branch point is given below. 

Branch point: 

Suppose that several branches of $f(z)$ are analytic in the neighborhood 
of $z = a$ but not at $z = a$. Then, the point $z = a$ is a branch point if and 
only if $f(z)$ passes from one of these branches to another when $z$ moves 
along a closed circuit around $z = a$. 


Remark. The point at infinity, $z = \infty$, is a branch point of $f(z)$ if and only if the 
origin is a branch point of $f(1/z)$. 

It is important to note that the branch points for a given multivalued function 
always occur pairwise, so that they are connected by a simple curve called the 
branch cut (cut or branch line). Branch cuts bound the regions within 
which the individual single-valued branches are defined. For instance, in the 
case of $f(z) = z^{1/2}$, the branch cut ran from the branch point at $z = 0$ to 
another branch point at $z = \infty$ along the positive real axis. It should be 
emphasized here that any curve joining the origin ($z = 0$) and the point at 
infinity ($z = \infty$) would have done just as well. For example, we could have 
used the negative real axis as the branch cut, for which the regions 

\[ -\pi < \phi < \pi \quad \text{and} \quad \pi < \phi < 3\pi \]

(instead of $0 \le \phi < 2\pi$ and $2\pi \le \phi < 4\pi$) serve as the defining regions for 
the first and second branch. On the $w$-plane, these two would correspond to 
$\mathrm{Re}\, w > 0$ and $\mathrm{Re}\, w < 0$, respectively. We therefore may choose the branch cut 
that is most convenient for the problem at hand. 

Remark. The choice of branches and branch cuts for a given multivalued 
function is not unique; however, the branch points and the number of branches 
are uniquely determined once a function is given. 


Exercises 

1. Examine the multivaluedness of the logarithm function $\ln z$. 

Solution: Expressing $z$ in polar form, $\ln z = \ln(r e^{i\phi}) = \ln r + i\phi$, 
and changing $\phi$ by $2\pi k$ results in 

\[ \ln z(r, \phi + 2\pi k) = \ln r + i(\phi + 2\pi k) = \ln z(r, \phi) + 2\pi i k. \tag{8.8} \]

It follows from (8.8) that there is no nonzero value of $k$ for which 
$\ln z(r, \phi + 2\pi k)$ and $\ln z(r, \phi)$ are equal. Therefore, the logarithm 
function is an infinite-valued function. ∎

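2. Evaluate $\log e$, $\log(-1)$, $\log(1 + i)$ according to the expression (8.8). 

Solution: With $n$ an arbitrary integer, 

\[ \log e = \log|e| + i \arg e = 1 + 2n\pi i, \]
\[ \log(-1) = \log|-1| + i \arg(-1) = (2n + 1)\pi i, \]
\[ \log(1 + i) = \log|1 + i| + i \arg(1 + i) = \frac{\log 2}{2} + \left(2n + \frac{1}{4}\right)\pi i. \]
∎

3. Evaluate $1^i$ and $i^i$ according to the definition of power functions: $z^{\alpha} = 
e^{\alpha \log z}$, where $z$ ($\neq 0$) and $\alpha$ are complex numbers. 

Solution: With $n$ an arbitrary integer, 

\[ 1^i = e^{i \log 1} = e^{i \cdot 2n\pi i} = e^{-2n\pi}, \]
\[ i^i = e^{i \log i} = e^{i\left(2n + \frac{1}{2}\right)\pi i} = e^{-\left(2n + \frac{1}{2}\right)\pi}. \]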

4. Show that a power function $z^{m/n}$ with an irreducible rational number 
$m/n$ ($n \ge 2$) is an $n$-valued function. 

Solution: The multiple values of $z(r, \phi)^{m/n} = r^{m/n} e^{im\phi/n}$ are 
found by varying the integer $k$ in the expression 

\[ z(r, \phi + 2\pi k)^{m/n} = r^{m/n} e^{im\phi/n} e^{i2\pi km/n} = e^{i2\pi km/n}\, z(r, \phi)^{m/n}. \]

Substituting $k = n$ yields 

\[ z(r, \phi + 2\pi n)^{m/n} = e^{i2\pi m}\, z(r, \phi)^{m/n} = z(r, \phi)^{m/n}, \]

wherein $e^{i2\pi m} = 1$ for arbitrary $m \in \mathbb{N}$. Hence, all multiple values 
of $z^{m/n}$ at a given $z$ are found with a value of $k$ in the range 
$0 \le k \le n - 1$. Since there are $n$ different values of $k$ in this range, and 
since the factors $e^{i2\pi km/n}$ are all distinct for these $k$ because $m/n$ is irreducible, 
$z^{m/n}$ is an $n$-valued function. ∎
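
The results of Exercises 2 to 4 lend themselves to a quick numerical check. The following minimal sketch enumerates a finite window of branches of the logarithm; the helper names and the sample point $2 + i$ with exponent $3/5$ are hypothetical choices of ours.

```python
import cmath
import math

def log_values(z, ks):
    """Values Log z + 2*pi*i*k of the multivalued logarithm for the integers k in ks."""
    principal = cmath.log(z)
    return [principal + 2j * math.pi * k for k in ks]

def power_values(z, alpha, ks):
    """Values exp(alpha * log z) taken over the chosen branches of log z."""
    return [cmath.exp(alpha * w) for w in log_values(z, ks)]

def count_distinct(values, tol=1e-9):
    """Number of values that differ from one another by more than tol."""
    distinct = []
    for v in values:
        if all(abs(v - u) > tol for u in distinct):
            distinct.append(v)
    return len(distinct)

# i**i on the branches k = -2, ..., 2; every value is real and positive.
i_to_i = power_values(1j, 1j, range(-2, 3))

# z**(3/5) sampled over 41 branches of the logarithm still gives only 5 values.
n_values = count_distinct(power_values(2 + 1j, 3 / 5, range(-20, 21)))
print(i_to_i, n_values)
```

The entry of `i_to_i` with $k = 0$ is the familiar value $e^{-\pi/2}$.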

8.3 Analytic Continuation 

8.3.1 Continuation by Taylor Series 

It is often the case that a complex function is defined only in a limited region 
in the complex plane. For instance, a series representation of a function is of 
use only within its radius of convergence, but provides no direct information 
about the function outside this radius of convergence. An illustrative example 
is a function $f(z)$ defined by 

\[ f(z) = 1 + z + z^2 + \cdots. \tag{8.9} \]

Obviously, this function is identified with $1/(1 - z)$ for $|z| < 1$, whereas it 
diverges for $|z| > 1$ and thus is no longer equivalent to $1/(1 - z)$. Nevertheless, 
a sophisticated technique makes it possible to identify the function $f(z)$ given 
in (8.9) with $1/(1 - z)$ even for the region $|z| > 1$. This technique, by which 
the defined region of a function is extended to an ‘uncultivated’ region, is 
called analytic continuation. The resultant function may often be defined 
by sequential continuation over the entire complex plane without reference to 
the original region of definition. 

To see an actual process of analytic continuation, we suppose that a 
function $f$ is given as a power series around $z = 0$, with a radius of convergence 
$R$ and a singular point of $f$ on the circle of convergence. We show 
that it is possible to extend the function outside $R$. We first note that at any 
point $z = a$ within the circle ($|z| < R$), we can evaluate not only the value of 
the series but all its derivatives at that point as well because the function $f$ 
is analytic and the differentiated series have the same radius of convergence. 
Therefore, we can obtain a Taylor series of $f(z)$ around $z = a$ as 

\[ f(z) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (z - a)^n. \tag{8.10} \]

The radius of convergence of this series is the distance to the nearest singular 
point, say $z = z_s$ (see Fig. 8.3a). The resultant circle of convergence with 
radius $R_a = |z_s - a|$ is indicated by the solid circle in the figure. One may 
repeat this process using a new point, e.g., $z = b$, not necessarily within the 
original circle of convergence (see Fig. 8.3b), about which a new series such as 
(8.10) can be set up (see Fig. 8.3c). Continuing on in this way, it is apparently 
possible by means of such a series of overlapping circles to obtain values for 
$f$ for every point in the complex plane excluding the singular points. 

Our current discussion can be summarized as follows: 

1. Let f(z) be defined by its Taylor series expansion around z = a within 
some circle \z — a\ = r. 

2. Specify a certain point $z = b$ within the circle and evaluate $f(b), f'(b), \cdots$ 
to obtain a Taylor series of $f(z)$ around $z = b$. 

3. Observe that the latter series converges within a circle \z — b\ = r' that 
intersects the first circle but may contain a region that is not within the 
first circle. 

4. Specify again another point z = c within the circle \z — b\ = r' and repeat 
the process described above. 
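
The four steps above can be sketched for the geometric series (8.9). In the following minimal Python example the new centre $a = -1/2$, the evaluation point $z = -6/5$, and the truncation lengths are our own illustrative choices. Exact rational arithmetic (`fractions.Fraction`) is used because the re-expansion sums alternate in sign and would suffer from cancellation in floating point.

```python
from fractions import Fraction
from math import comb

def recentered_coefficients(a, n_coeffs, n_terms):
    """Taylor coefficients about z = a of f(z) = sum_{n>=0} z**n, for |a| < 1.

    Only the original series is used: differentiating it term by term gives
    f^(k)(a) / k! = sum_{n>=k} C(n, k) * a**(n - k), truncated here at n_terms.
    """
    coeffs = []
    for k in range(n_coeffs):
        c_k = sum(comb(n, k) * a ** (n - k) for n in range(k, n_terms))
        coeffs.append(c_k)
    return coeffs

a = Fraction(-1, 2)  # new centre inside the unit circle (illustrative choice)
coeffs = recentered_coefficients(a, n_coeffs=40, n_terms=300)

z = Fraction(-6, 5)  # |z| > 1, yet |z - a| = 7/10 lies inside the new circle
continued = float(sum(c * (z - a) ** k for k, c in enumerate(coeffs)))
exact = 1 / (1 - float(z))  # value of 1/(1 - z) at the same point
print(continued, exact)
```

The re-expanded series reproduces $1/(1 - z)$ at a point where the original series (8.9) diverges, which is the content of step 3.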

8.3.2 Function Elements 

We know that the term ‘analytic continuation’ refers to a method that allows 
us to extend the defining region of a complex function. Alternatively, this term 
can refer to the function that is newly found through analytic continuation of 
some other function. The formal definition is given below. 

Fig. 8.3. Illustration of an analytic continuation procedure 


Analytic continuation: 

Given a single-valued analytic function $f_1(z)$ defined on a region $D_1$, the 
analytic function $f_2(z)$ defined on $D_2$ is called an analytic continuation 
of $f_1(z)$ to $D_2$ if and only if the intersection $D_1 \cap D_2$ contains a simply 
connected open region where $f_1(z) = f_2(z)$. 


If the two analytic functions $f_1(z)$ and $f_2(z)$ defined on $D_1$ and $D_2$, 
respectively, are analytic continuations of one another, then it is evident that an 
analytic function $f(z)$ can be defined on $D_1 \cup D_2$ by setting 

\[ f(z) = \begin{cases} f_1(z) & \text{in } D_1, \\ f_2(z) & \text{in } D_2. \end{cases} \]

Here, $f_1$ and $f_2$ are called function elements of $f$. More generally, we can 
consider a sequence of function elements $(f_1, f_2, \cdots, f_n)$ such that $f_k$ is an 
analytic continuation of $f_{k-1}$. The elements of such a sequence are called 
analytic continuations of each other. Relevant terminology for this point 
is given below. 

General analytic function: 

A general analytic function $f$ is a nonvoid collection of function elements 
$f_k$ in which any two elements are analytic continuations of each other by 
way of a chain whose links are members of $f$. 

Complete analytic function: 

A complete analytic function $f$ is a general analytic function that 
contains all the analytic continuations of any one of its elements. 


A complete analytic function is evidently maximal in the sense that it 
cannot be further extended. Moreover, it is clear that every function element 
belongs to a unique complete analytic function. Incomplete general analytic 
functions are more arbitrary, and there are many cases in which two different 
collections of function elements should be regarded as defining the same 
function. For instance, a single-valued function $f(z)$ defined in $D$ can be identified 
either with the collection that consists of the single function element defined 
on $D$ or with the collection of all function elements defined on $D' \subset D$. 


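Examples 1. Let us consider the functions 

\[ f_1(z) = \sum_{n=0}^{\infty} z^n \quad \text{defined on } |z| < 1 \tag{8.11} \]

and 

\[ f_2(z) = \sum_{n=0}^{\infty} \left(\frac{3}{5}\right)^{n+1} \left(z + \frac{2}{3}\right)^{n} \quad \text{defined on } \left|z + \frac{2}{3}\right| < \frac{5}{3}. \tag{8.12} \]

Both series converge to $1/(1 - z)$; in particular, the latter converges since 

\[ f_2(z) = \frac{3}{5} \sum_{n=0}^{\infty} \left[\frac{3}{5}\left(z + \frac{2}{3}\right)\right]^{n} = \frac{3/5}{1 - \frac{3}{5}\left(z + \frac{2}{3}\right)} = \frac{1}{1 - z}. \]

Therefore, the two functions represent the same function $f(z) = 1/(1 - z)$ 
in the two overlapping regions (see Fig. 8.4), although they have different 
series representations. In this context, we can write 

\[ f(z) = \begin{cases} f_1(z) & \text{for } z \in D_1, \quad D_1 = \{z : |z| < 1\}, \\ f_2(z) & \text{for } z \in D_2, \quad D_2 = \{z : |z + \tfrac{2}{3}| < \tfrac{5}{3}\}. \end{cases} \]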


Fig. 8.4. Both functions $f_1(z)$ in (8.11) and $f_2(z)$ in (8.12) represent the same 
function $f(z) = 1/(1 - z)$ in the overlapping region $D_1 \cap D_2$ 

2. Another illustrative example is given by 

\[ f_1(z) = \int_0^{\infty} e^{-zt}\, dt \quad \text{defined on } \mathrm{Re}\, z > 0 \tag{8.13} \]

and 

\[ f_2(z) = i \sum_{n=0}^{\infty} \left(\frac{z + i}{i}\right)^{n} \quad \text{defined on } |z + i| < 1. \]

Observe that each of $f_1$ and $f_2$ reads $1/z$ in its respective defining region. 
Thus, we have 

\[ \frac{1}{z} = \begin{cases} f_1(z) & \text{for } z \in D_1, \quad D_1 = \{z : \mathrm{Re}\, z > 0\}, \\ f_2(z) & \text{for } z \in D_2, \quad D_2 = \{z : |z + i| < 1\}. \end{cases} \]

The two functions are analytic continuations of one another, and $f(z) = 
1/z$ is the analytic continuation of both $f_1$ and $f_2$ for all $z$ except $z = 0$. 
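
The second example can be verified numerically. The sketch below assumes the forms $f_1(z) = \int_0^\infty e^{-zt}\,dt$ and $f_2(z) = i\sum_n ((z+i)/i)^n$, which are the representations consistent with both functions reducing to $1/z$; the quadrature parameters and sample points are our own choices.

```python
import cmath

def f1_integral(z, t_max=80.0, steps=40000):
    """Trapezoidal estimate of the integral of exp(-z t) over 0 <= t <= t_max.

    Meaningful only for Re z > 0, where the neglected tail beyond t_max is tiny.
    """
    h = t_max / steps
    total = 0.5 * (1 + cmath.exp(-z * t_max))
    for k in range(1, steps):
        total += cmath.exp(-z * k * h)
    return total * h

def f2_series(z, terms=200):
    """Partial sum of i * sum_n ((z + i)/i)**n, convergent for |z + i| < 1."""
    q = (z + 1j) / 1j
    return 1j * sum(q ** n for n in range(terms))

z_overlap = 0.5 - 0.7j   # Re z > 0 and |z + i| < 1: both elements apply
z_only_d2 = -0.3 - 0.9j  # Re z < 0, but still |z + i| < 1

err_integral = abs(f1_integral(z_overlap) - 1 / z_overlap)
err_series = abs(f2_series(z_overlap) - 1 / z_overlap)
err_continued = abs(f2_series(z_only_d2) - 1 / z_only_d2)
print(err_integral, err_series, err_continued)
```

At `z_only_d2` the integral no longer converges, yet the series element still returns $1/z$, so it continues the first element into the left half-plane.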


Remark. In some cases, it is impossible to extend the function outside of a 
finite region because an infinite number of singularities are located densely on 
the boundary of the region. In that event, the boundary of this region is called 
the natural boundary of the function and the region within this boundary 
is called the region of existence of the function. 

8.3.3 Uniqueness Theorem 

Having introduced the concept of analytic continuation, we may ask 
whether the function resulting from an analytic continuation process 
is uniquely determined, independent of the continuing path; i.e., whether a 
function that is continued along two different routes from one area to another 
will have the same value in the final area. We now attempt to answer this 
question by examining the theorem below. 

Uniqueness theorem: 

Let $f_1(z)$ and $f_2(z)$ be analytic within a region $D$. If the two functions 
coincide in the neighborhood of a point $z_0 \in D$, then they coincide 
throughout $D$. 


Proof The theorem to be proven is rewritten in the following statement: If 
both $f(z)$ and $g(z)$ are analytic at $z_0$ and if $f(z_n) = g(z_n)$ with $n = 1, 2, \cdots$ at 
points $z_n$ that satisfy $\lim_{n \to \infty} z_n = z_0$ but $z_n \neq z_0$ for all $n$, then $f(z) = g(z)$ 
throughout $D$. We now prove it. 

Let $h(z) = f(z) - g(z)$. Here, $f$ and $g$ are assumed to satisfy the conditions 
given in the statement above, so that $h(z_n) = 0$ for all $n$ and $h(z)$ is analytic 
at $z_0$. Owing to the analyticity of $h(z)$ at $z_0$, we have the expansion 

\[ h(z) = a_0 + a_1(z - z_0) + a_2(z - z_0)^2 + \cdots, \]

which converges in a certain circle around $z_0$. Since $h(z)$ is continuous at $z_0$, 
we have 

\[ h(z_0) = \lim_{n \to \infty} h(z_n) = 0, \]

which means that the coefficient $a_0$ is zero. Then the function 
$h_1(z) = h(z)/(z - z_0) = a_1 + a_2(z - z_0) + \cdots$ is also continuous 
at $z_0$ and vanishes at every $z_n$ because $z_n \neq z_0$, so that 

\[ a_1 = h_1(z_0) = \lim_{n \to \infty} h_1(z_n) = 0. \]

Continuing in this fashion, we find successively 
that all the coefficients vanish. In its circle of convergence, the function $h(z)$ 
is therefore identically zero. Repeating the argument about further points along 
a chain of overlapping circles extends this result to every point of $D$. This completes the proof. ∎

This remarkable theorem demonstrates the strong correlation between the 
behaviors of analytic functions on different parts of the complex plane. For 
example, if two functions agree in value over a small arc (arbitrarily small as 
long as it is not a point), then they are identical in their common region of 
analyticity. 

8.3.4 Conservation of Functional Equations 

An important consequence of the uniqueness theorem is the so-called principle 
of the conservation of a functional equation. 

Conservation of functional equations: 

Let $F(p, q, r)$ be an analytic function for all values of the three 
variables $p, q, r$, and let $f(z)$ and $g(z)$ be analytic functions of $z$. If a relation 
$F[f(z), g(z), z] = 0$ between function elements $f(z)$ and $g(z)$ holds on a 
domain, then this relation is also true for all analytic continuations of these 
function elements. 


Remark. In plain words, this theorem states that analytic continuations of $f(z)$ 
satisfy every functional (and differential) equation satisfied by the original 
$f(z)$. 

This theorem can easily be generalized to cases of functional equations involv- 
ing more than two functions. We illustrate this by two examples. 

Examples 1. From elementary trigonometry, we know that the real function 
$\sin x$ obeys the addition theorem 

\[ \sin(x + u) = \sin x \cos u + \cos x \sin u, \]

where $u$ is an arbitrary real value. Since $\sin z$, $\cos z$, and $\sin(z + u)$ are 
analytic for all finite values of $z$, and since the relation 

\[ \sin(z + u) = \sin z \cos u + \cos z \sin u \]

is satisfied if $z$ is any point on the real axis, it follows by analytic 
continuation that the same relation must hold for all values of $z$. If we repeat 
the same argument with respect to the real variable $u$, we find that $u$ may 
be replaced by a complex variable $w$ without invalidating the relation in 
question. Hence, the addition theorem of the function $\sin z$ is true for 
arbitrary complex values of $z$ and $w$. 

2. Another important example is afforded by functions satisfying differential 
equations. To take a simple case, we consider the function 

\[ f(z) = \log(1 + z). \]

This is represented for $|z| < 1$ by the power series 

\[ f(z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots, \tag{8.14} \]

which yields 

\[ f'(z) = 1 - z + z^2 - z^3 + \cdots = \frac{1}{1 + z}. \]

In this context, the identity 

\[ f'(z) = \frac{1}{1 + z} \tag{8.15} \]

is established only for $|z| < 1$. However, it follows from the principle of conservation that the identity (8.15) 
must hold for all analytic continuations of the power series (8.14). 
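
Both examples can be probed numerically at points well away from the region where the relations were first established. The following sketch uses sample points of our own choosing and a central-difference derivative of the principal logarithm, so it is a plausibility check rather than a proof.

```python
import cmath

def addition_theorem_gap(z, w):
    """|sin(z + w) - (sin z cos w + cos z sin w)| for complex z and w."""
    lhs = cmath.sin(z + w)
    rhs = cmath.sin(z) * cmath.cos(w) + cmath.cos(z) * cmath.sin(w)
    return abs(lhs - rhs)

def derivative_gap(z, h=1e-6):
    """|d/dz log(1 + z) - 1/(1 + z)| using a central difference in the real direction."""
    numeric = (cmath.log(1 + z + h) - cmath.log(1 + z - h)) / (2 * h)
    return abs(numeric - 1 / (1 + z))

# Complex arguments, far from the real axis where the theorem was first known.
gap_sin = addition_theorem_gap(1.2 - 0.7j, -0.4 + 2.5j)

# |z| > 1: outside the disc in which the power series for log(1 + z) converges.
gap_log = derivative_gap(2.0 + 3.0j)
print(gap_sin, gap_log)
```

Both gaps are at the level of rounding and discretization error, as the principle of conservation leads one to expect.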


8.3.5 Continuation Around a Branch Point 

The uniqueness theorem given in Sect. 8.3.3 also gives us the following corol- 
lary: 

Theorem: 

If $D_1$ and $D_2$ are regions into which $f(z)$ has been continued from $D$, 
yielding the corresponding functions $f_1$ and $f_2$, and if $D_3 = D_1 \cap D_2$ also 
overlaps $D$, then $f_1 = f_2$ throughout $D_3$. 


It is important to note that the validity of this theorem is due to the condition 
that $D_3$ and $D$ have a common region. If this condition is not satisfied, the 
uniqueness of analytic continuation may break down. Instead, one can say: If 
analytic continuation of a function $f$ along two different routes from $z_0$ to $z_1$ 
yields two different values at $z_1$, then $f(z)$ must have a certain kind of 
singularity between the two routes. This seems obvious by recalling the fact that 
the radius of convergence of a power series extends up to the nearest singularity 
of the function; if there were no singularities between the two routes, then it 
would be possible to fill in the region between the two routes by means of 
analytic continuation based on the power series. Then we would obtain sufficient 
overlapping so that the uniqueness theorem would be satisfied. In that event 
$f(z_1)$ for the two different routes would be identical, in contradiction to our 
hypothesis. There must therefore be a singularity between the two routes. 

Note that the last discussion does not state that different values must be 
obtained if there is any kind of singularity between the two routes. It must be 
a particular type of singularity to cause a discrepancy, and we call it a branch 
point, as we introduced earlier. An analytic function involving branch points 
is said to be multivalued and the various possible sets of values generated by 
the process of analytic continuation are known as branches. Intuitively, all 
the possible values of a function at a given point may be obtained by the 
process of analytic continuation if one winds about the branch point as many 
times as necessary. 


8.3.6 Natural Boundaries 

In all the examples considered so far, the singularities were isolated points. It 
is, however, easy to construct functions for which this is not the case. Consider, 
say, the function 

\[ f(z) = \frac{1}{\sin(1/z)}. \]

The denominator vanishes for $1/z = n\pi$ with a nonzero integer $n$. Hence, the points 
$z = 1/(n\pi)$ are singular points of $f(z)$; they accumulate at the origin, so the 
singularity at $z = 0$ is clearly not isolated. It is further possible for the singular points of a function to fill 
a whole arc of a continuous curve; in this case, we speak of a singular line 
of the function. 

Particularly interesting is a situation in which a function $f(z)$ has a closed 
singular line $C$. In this case, it is obviously impossible to continue $f(z)$ 
analytically across $C$. The entire domain of definition of $f(z)$ is therefore 
the interior of $C$, and we say that $C$ is a natural boundary of $f(z)$. 

Such an occurrence is not as unusual as it may seem. Consider, for instance, 
the analytic function $f(z)$ defined by the power series 

\[ f(z) = z + z^2 + z^4 + z^8 + \cdots = \sum_{n=0}^{\infty} z^{2^n}. \tag{8.16} \]

By the root test given in Sect. 2.4.3, the circle of convergence of this series 
turns out to be $|z| < 1$. Thus $f(z)$ must have at least one singularity on $|z| = 1$. 
For the sake of simplicity, we assume that this singularity is situated at the 
point $z = 1$; a different location will cause a minor change in the argument. 
From the definition of $f(z)$, it follows that 

\[ f(z^2) = z^2 + z^4 + z^8 + \cdots = \sum_{n=1}^{\infty} z^{2^n} = f(z) - z. \]

By the principle of conservation (see Sect. 8.3.4), the functional equation 

\[ f(z) = z + f(z^2) \tag{8.17} \]

is true for all analytic continuations of $f(z)$. Observe that (8.17) gives 

\[ f'(z) = 1 + 2z f'(z^2), \]

which means that $f(z)$ cannot have a derivative at $z = -1$ since from 
hypothesis $f'(1)$ does not exist. Thus, $z = -1$ is also a singular point of $f(z)$. In the 
same way, from the relation 

\[ f(z) = z + f(z^2) = z + z^2 + f(z^4) \]

it follows that the points $z$ for which $z^4 = 1$ are singularities of $f(z)$. 
Continuing in this fashion, we conclude that all points $z$ for which $z^{2^n} = 1$ 
are singularities of $f(z)$. But these are the points $e^{2\pi i k/2^n}$ ($k = 0, 1, \cdots, 2^n - 1$) that divide the 
circumference $|z| = 1$ into $2^n$ equal parts. Since, for $n \to \infty$, all points on 
$|z| = 1$ are limits of these points and since the limit point of singular points 
is also a singularity, it follows that all points on $|z| = 1$ are singular points of 
$f(z)$. We have thus proven that the unit circle is the natural boundary of the 
analytic function (8.16). 
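
The growth of the series (8.16) on approaching the unit circle along a ray through a $2^n$th root of unity can be observed numerically. The sketch below is a minimal illustration; the chosen direction (an eighth root of unity) and the list of radii are our own assumptions.

```python
import cmath

def lacunary_sum(z, max_terms=200):
    """Sum of z**(2**n), n = 0, 1, 2, ..., for |z| < 1, built by repeated squaring."""
    total = 0j
    term = complex(z)
    for _ in range(max_terms):
        total += term
        if abs(term) < 1e-18:  # remaining terms are negligible
            break
        term = term * term      # z**(2**n) -> z**(2**(n+1))
    return total

direction = cmath.exp(2j * cmath.pi * 3 / 8)  # a point with z**8 = 1
radii = [0.9, 0.99, 0.999, 0.9999]
magnitudes = [abs(lacunary_sum(r * direction)) for r in radii]
print(magnitudes)
```

Along this ray all terms with $n \ge 3$ are real and positive, so the magnitude keeps increasing as the radius tends to 1, in line with the singular behavior derived above.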

8.3.7 Technique of Analytic Continuations 

The uniqueness theorem is the fundamental theorem in the theory of analytic 
continuation. However, in practice, the most relevant method would be one 
that tells us whether a function $f_2$ is the analytic continuation of a function 
$f_1$. 

Let us describe two possible methods of analytic continuation: The first is 
based on the Schwarz principle of reflection, which essentially makes use 
of the functional relation $f(z^*) = f(z)^*$. 

Schwarz principle of reflection:

If f(z) is analytic within a region D intersected by the real axis and is
real on the real axis, then we have f(z^*) = f(z)^*.


Proof Expand f(z) in a Taylor series about a point a on the real axis. The
coefficients of the Taylor series are real by virtue of the hypothesis that f(z)
is real on the real axis. Hence, we have

    f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n,    (8.18)

where c_n is real. Then

    f(z)^* = \sum_{n=0}^{\infty} c_n (z^* - a)^n = f(z^*),    (8.19)

proving the theorem.

The above theorem holds for any point within the circle of convergence of 
the power series. By the methods of analytic continuation, therefore, it may 
be extended to include any nonsingular point conjugate to a point in D. As a 
result, the function in question can be continued from a region above the real 
axis to a region below. 

A second method employs explicit functional relations such as addition 
formulas or recurrence relations. A simple example is provided by the 
addition formula 

    f(z + z_1) = f(z) f(z_1).

If f were known only in a given region, it could be continued outside that
region to any point given by the addition of the coordinates of any two points
within the region. A less trivial example occurs in the theory of gamma
functions. The gamma function is defined by the integral

    \Gamma(z) = \int_0^{\infty} e^{-t} t^{z-1} dt.    (8.20)

This integral converges only for Re z > 0, so that it defines Γ(z) for only
the right half of the complex plane. From (8.20), one may readily derive (by
integrating by parts) a functional relationship between Γ(z) and Γ(z + 1):

    z \Gamma(z) = \Gamma(z + 1).    (8.21)

We may now use (8.21) to continue Γ(z) into the Re z < 0 part of the complex
plane. At first, we assume that Γ(z) is known for x > 0, where x = Re z. Then,
using the recurrence relation (8.21), the points in the strip -1/2 < x < 1/2
can be computed in terms of the values of Γ(z) for x > 0. The function so
defined and the original function have an overlapping region of convergence,
so that it is the analytic continuation into the negative-x region.
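
The continuation by the recurrence relation (8.21) can be sketched in a few lines for real arguments. This is a minimal illustration under stated assumptions: the helper name gamma_continued is hypothetical, and Python's math.gamma is called only for positive arguments, where it plays the role of the function "known for x > 0".

```python
import math


def gamma_continued(x):
    """Continue Gamma to real x <= 0 by repeated use of Gamma(x) = Gamma(x + 1) / x.

    math.gamma is evaluated only at a positive argument.
    """
    if x <= 0 and x == int(x):
        raise ValueError("Gamma has a pole at every nonpositive integer")
    divisor = 1.0
    while x <= 0:
        divisor *= x      # collect the factors x (x + 1) ... until x + k > 0
        x += 1.0
    return math.gamma(x) / divisor


if __name__ == "__main__":
    for x in (-0.5, -1.5, -2.25):
        print(x, gamma_continued(x), math.gamma(x))
```

Each pass through the loop applies (8.21) once, so a point with -k < x < -k + 1 is reached after k steps; the nonpositive integers are excluded because the divisor vanishes there.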


8.3.8 The Method of Moment 


Suppose that we are given a power series f(z) = \sum_{n=0}^{\infty} a_n z^n
where the coefficients a_n are the moments of a given continuous function.
For example, suppose that there exists a continuous function g on [0, 1]
such that

    a_n = \int_0^1 g(t) t^n dt.

Then

    f(z) = \sum_{n=0}^{\infty} \left[ \int_0^1 g(t) t^n dt \right] z^n
         = \sum_{n=0}^{\infty} \int_0^1 g(t) (zt)^n dt,

and interchanging the order of summation and integration, we find that

    f(z) = \int_0^1 \left[ \sum_{n=0}^{\infty} g(t) (zt)^n \right] dt
         = \int_0^1 \frac{g(t)}{1 - zt} dt.

(The interchange of summation and integration is easy to justify if |z| < 1.)
Moreover, this integral form serves to define an analytic extension of the
original power series.


Examples Consider

    f(z) = \sum_{n=0}^{\infty} \frac{z^n}{n + 1}    (|z| < 1).    (8.22)

Since

    \frac{1}{n + 1} = \int_0^1 t^n dt,

we set g(t) = 1 to obtain

    f(z) = \int_0^1 \frac{dt}{1 - zt}    for |z| < 1.

The integral above is the analytic continuation of the original representation
(8.22), so that the latter is analytic throughout the complex plane except for
the semi-infinite line [1, ∞). [In fact, the analytic continuation has a
discontinuity at every point of the interval [1, ∞).]
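
A short numerical check of this example follows. It is a sketch only: the function names, the midpoint rule, and the sample points are choices made for the illustration. Inside the unit disk the series (8.22) and the integral agree, while outside the disk only the integral is meaningful; there it is compared with the elementary antiderivative -log(1 - z)/z.

```python
import cmath


def series_value(z, terms=400):
    """Partial sum of sum_{n>=0} z**n / (n + 1); meaningful only for |z| < 1."""
    total = 0j
    power = 1 + 0j                # holds z**n
    for n in range(terms):
        total += power / (n + 1)
        power *= z
    return total


def integral_value(z, steps=20000):
    """Midpoint rule for the integral of 1 / (1 - z t) over 0 <= t <= 1."""
    h = 1.0 / steps
    return sum(h / (1 - z * (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    print(series_value(0.5), integral_value(0.5))             # inside the disk
    print(integral_value(-3.0), -cmath.log(1 + 3.0) / -3.0)   # outside the disk
```

The points z = -3 and z = 2i lie outside the circle of convergence but away from the cut [1, ∞), so the integral is a legitimate value of the continued function there.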


Exercises 


1. Suppose f(z) = \sum_{k=0}^{\infty} c_k z^{n_k} with
   \liminf_{k \to \infty} n_{k+1}/n_k > 1. Prove that the circle of
   convergence of f(z) is a natural boundary for f.


Solution: Since the result is independent of c_k, we may assume
without loss of generality that the radius of convergence is 1. In
addition, neglecting finitely many terms if necessary, we assume
that for some δ > 0 and for all k, n_{k+1}/n_k ≥ 1 + δ. Finally,
it suffices to show that f is singular at the point z = 1. The
same result applied to the series \sum_k c_k (z e^{-iθ})^{n_k} shows
that f is singular at any point z = e^{iθ}.

Choose an integer m > 0 such that (m + 1)/m < 1 + δ and
consider the power series g(w) obtained by setting
z = (w^m + w^{m+1})/2. We then find that

    g(w) = f\left( \frac{w^m + w^{m+1}}{2} \right)
         = \frac{c_0}{2^{n_0}} w^{m n_0} + \frac{c_0 n_0}{2^{n_0}} w^{m n_0 + 1}
           + \cdots + \frac{c_0}{2^{n_0}} w^{m n_0 + n_0}
           + \frac{c_1}{2^{n_1}} w^{m n_1} + \frac{c_1 n_1}{2^{n_1}} w^{m n_1 + 1}
           + \cdots + \frac{c_1}{2^{n_1}} w^{m n_1 + n_1} + \cdots.

Note that in this expression no two terms involve the same power
of w, since the inequality m n_{k+1} > m n_k + n_k holds whenever
n_{k+1}/n_k > (m + 1)/m. If |w| < 1, then (|w|^m + |w|^{m+1})/2 < 1,
and since f(z) is absolutely convergent for |z| < 1, the series
\sum_{k=0}^{\infty} |c_k| [(|w|^m + |w|^{m+1})/2]^{n_k} converges.
Hence, for |w| < 1, g(w) is absolutely convergent. On the other hand,
if we take w real and greater than 1, then (w^m + w^{m+1})/2 > 1, so
the series \sum_{k=0}^{\infty} c_k [(w^m + w^{m+1})/2]^{n_k} diverges.
Note, though, that the jth partial sums s_j of the above series are
exactly the partial sums of the power series of g taken through the
power w^{(m+1) n_j}. Hence, the series for g(w) diverges and g, too,
has a radius of convergence of 1. This means that g(w) must have a
singularity at some point w_0 with |w_0| = 1. If w_0 ≠ 1, then
|(w_0^m + w_0^{m+1})/2| < 1 and since f is analytic in |z| < 1, g is
analytic at w_0. Thus g must have a singularity at w_0 = 1 and since
g(w) = f[(w^m + w^{m+1})/2], f(z) must have a singularity at z = 1.


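2. Define an analytic continuation of:

    (i) \sum_{n=1}^{\infty} \frac{z^n}{n^{1/3}},    (ii) \sum_{n=0}^{\infty} \frac{z^n}{n^2 + 1}.


Solution:

(i) Since

    \frac{1}{n^{1/3}} = \frac{1}{\Gamma(1/3)} \int_0^{\infty} e^{-nt} t^{-2/3} dt,

we have

    \sum_{n=1}^{\infty} \frac{z^n}{n^{1/3}}
        = \frac{1}{\Gamma(1/3)} \int_0^{\infty} \sum_{n=1}^{\infty} (z e^{-t})^n t^{-2/3} dt
        = \frac{z}{\Gamma(1/3)} \int_0^{\infty} \frac{dt}{t^{2/3} (e^t - z)},

which is analytic outside of the interval [1, ∞).


(ii) Since

    \frac{1}{n^2 + 1} = \int_0^{\infty} e^{-nt} \sin t \, dt,

we have

    \sum_{n=0}^{\infty} \frac{z^n}{n^2 + 1}
        = \int_0^{\infty} \sum_{n=0}^{\infty} (z e^{-t})^n \sin t \, dt
        = \int_0^{\infty} \frac{e^t \sin t}{e^t - z} dt,

which is analytic outside of the interval [1, ∞).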


3. Suppose that f is bounded and analytic in Im z > 0 and real on the real
   axis. Prove that f is constant.

Solution: By the Schwarz reflection principle, f can be extended
to the entire plane and would then be a bounded entire function.
Hence, by Liouville's theorem, f is constant.


4. Given an entire function that is real on the real axis and imaginary on
   the imaginary axis, prove that it is an odd function; i.e., f(z) = -f(-z).

Solution: Set f(z) = u(x + iy) + i v(x + iy) with u and v real. Since f
is real on the real axis, the Schwarz reflection principle implies that

    f(z^*) = u(x - iy) + i v(x - iy) = u(x + iy) - i v(x + iy) = f(z)^*.

In a similar way, since f is imaginary on the imaginary axis, reflection
across the imaginary axis gives f(-z^*) = -f(z)^*. Combining the two
relations, we have

    f(-z) = u(-x - iy) + i v(-x - iy) = -u(x + iy) - i v(x + iy) = -f(z).


Contour Integrals


Abstract In this chapter, we show that singularities do not interfere with the anal- 
ysis of complex functions but are useful in extracting complex integrals along closed 
contours. This utility of singularities is based on the residue theorem (Sect. 9.1.1), 
argument principle (Sect. 9.4), and principal value integrals (Sect. 9.5.1), all of which 
correlate the nature of singularities within and/or on the contour with the relevant 
complex integrals. 


9.1 Calculus of Residues 

9.1.1 Residue Theorem 

In the preceding two chapters, we provided the theoretical bases of complex 
functions. This chapter deals with more practical matters that are relevant to 
computations of contour integrations on a complex plane. The theorem below 
is central to the development of this topic. 

Residue theorem:

If a function f(z) is analytic everywhere within a closed contour C except
at a finite number of poles, its contour integral along C yields

    \oint_C f(z) dz = 2\pi i \sum_j Res(f, a_j).    (9.1)

Here, Res(f, a_j) is called the residue of f(z) at the pole z = a_j. When the
pole is mth order, it reads

    Res(f, a_j) = \frac{1}{(m-1)!} \lim_{z \to a_j} \frac{d^{m-1}}{dz^{m-1}} \left[ (z - a_j)^m f(z) \right].    (9.2)

Once the residue is evaluated, the integral \oint_C f(z) dz around the contour C
surrounding the pole z = a can be determined by the above theorem. Notably,
this theorem enables us to evaluate various kinds of integrals of real functions
that are intractable by means of elementary calculus.

Before demonstrating the utility of the residue theorem, we present a short
review of the nature of residues. Originally, the residue of f(z) is defined in
association with a particular coefficient of the Laurent series expansion. We
know that f(z) around its pole at z = a may be expressed by a Laurent series
expansion such as

    f(a + h) = \sum_{n=-\infty}^{\infty} c_n h^n,
    c_n = \frac{1}{2\pi i} \oint_C \frac{f(z)}{(z - a)^{n+1}} dz.

Then, the specific coefficient

    c_{-1} = \frac{1}{2\pi i} \oint_C f(z) dz    (9.3)

is called the residue of f(z) at z = a. In fact, the result (9.3) immediately
reduces to the form of (9.1) as

    \oint_C f(z) dz = 2\pi i c_{-1}.

The equivalence of the two quantities, Res(f, a) in (9.2) and c_{-1} in (9.3), is
verified as follows.


Proof (of the residue theorem). Suppose that f(z) has a pole of order m at a.
Then f(z) can be written as

    f(z) = \frac{c_{-m}}{(z - a)^m} + \frac{c_{-m+1}}{(z - a)^{m-1}} + \cdots
           + \frac{c_{-1}}{z - a} + \sum_{n=0}^{\infty} c_n (z - a)^n.    (9.4)

Now we introduce the quantity

    g(z) = (z - a)^m f(z)
         = c_{-m} + c_{-m+1} (z - a) + \cdots
         = \sum_{n=0}^{\infty} c_{n-m} (z - a)^n.    (9.5)

Since g(z) is analytic everywhere in a neighborhood around a, it can be
expanded in terms of a Taylor series as

    g(z) = \sum_{n=0}^{\infty} \frac{g^{(n)}(a)}{n!} (z - a)^n.    (9.6)

The residue c_{-1} is the coefficient of the n = m - 1 term in (9.5). Hence,
comparing (9.5) with (9.6), we have

    c_{-1} = \frac{1}{(m-1)!} g^{(m-1)}(a)
           = \frac{1}{(m-1)!} \lim_{z \to a} \frac{d^{m-1}}{dz^{m-1}} \left[ (z - a)^m f(z) \right],    (9.7)

which is simply equation (9.2).


9.1.2 Remarks on Residues 

The reason that only the particular coefficient c_{-1} plays a role in evaluating
the contour integral is clarified by integrating both sides of (9.4) along the
contour containing the mth-order pole a. For convenience, we rewrite (9.4) as

    f(z) = \sum_{n=1}^{m} \frac{c_{-n}}{(z - a)^n} + \varphi_a(z),    (9.8)

where

    \varphi_a(z) = \sum_{n=0}^{\infty} c_n (z - a)^n

is the regular part of the series (9.8), thus being analytic everywhere in a
region within a closed contour C containing a. By integrating f(z) along the
contour C, we get

    \oint_C f(z) dz = \sum_{n=1}^{m} c_{-n} \oint_C \frac{dz}{(z - a)^n}    (9.9)

because of the analyticity of \varphi_a(z). The integral in (9.9) can be easily
evaluated by letting the contour be a circle of radius ρ centered at a. Since
any point on the contour can be expressed as z = a + ρ e^{iφ}, we have

    \oint_C \frac{dz}{(z - a)^n}
        = \int_0^{2\pi} \frac{i \rho e^{i\phi}}{\rho^n e^{in\phi}} d\phi
        = i \rho^{1-n} \int_0^{2\pi} e^{-i(n-1)\phi} d\phi.    (9.10)

Note that the integral (9.10) vanishes for all n ≠ 1, and it is only when n = 1
that it has a nonzero value:

    \oint_C \frac{dz}{z - a} = i \int_{\phi_0}^{\phi_0 + 2\pi} d\phi = 2\pi i.

Therefore, all the terms in the sum of (9.9) are zero except the n = 1 term,
and Goursat's formula takes the form

    \oint_C f(z) dz = 2\pi i c_{-1}.    (9.11)

In short, once we integrate the function f(z) in (9.8), only the term involving
c_{-1} survives, whereas the other terms vanish. This results in the fact that the
contour integral \oint_C f(z) dz around a pole is determined by the value of the
specific coefficient c_{-1}.
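
The selection rule expressed by (9.10) is easy to confirm numerically. The sketch below is illustrative only; the center a, the radius rho, and the number of sample points are arbitrary assumptions. It discretizes the circle z = a + ρ e^{iφ} and sums dz/(z - a)^n.

```python
import cmath


def circle_integral(n, a=0.3 + 0.2j, rho=0.5, steps=2000):
    """Approximate the integral of (z - a)**(-n) over the circle |z - a| = rho."""
    total = 0j
    dphi = 2 * cmath.pi / steps
    for k in range(steps):
        phase = cmath.exp(1j * k * dphi)
        z = a + rho * phase
        dz = 1j * rho * phase * dphi      # dz = i rho e^{i phi} d(phi)
        total += dz / (z - a) ** n
    return total


if __name__ == "__main__":
    for n in range(-1, 5):
        print(n, circle_integral(n))
```

Only n = 1 returns a value close to 2πi; every other integer power integrates to zero up to rounding error, which is why c_{-1} alone survives in (9.11).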

9.1.3 Winding Number


To evaluate \oint_C f(z) dz when C is a general closed curve (and when f may
have isolated singularities), we introduce the following concept:


Winding number:

Suppose that C is a closed curve and that the point z = a is not located
on C. Then the number

    n(C, a) = \frac{1}{2\pi i} \oint_C \frac{dz}{z - a}

is called the winding number of C around a.


Note that if C represents the boundary of a circle (traversed counterclockwise),
then the winding number reads

    n(C, a) = 1 if a is inside the circle,
    n(C, a) = 0 if a is outside the circle.

Both identities have already been proven in the context of Cauchy's theorem.
In addition, if the curve C encloses k times the point a, then we have

    n(C, a) = \frac{1}{2\pi i} \int_0^{2k\pi} i \, d\theta = k,

which explains the terminology "winding number."

Theorem:

For any closed curve C and point a not on C, the winding number n(C, a) is
an integer.


Proof Suppose that C is parametrized by z(t), 0 ≤ t ≤ 1, and set

    F(s) = \int_0^s \frac{z'(t)}{z(t) - a} dt    (0 ≤ s ≤ 1).

Then, it follows from

    F'(s) = \frac{z'(s)}{z(s) - a}

that the quantity

    [z(s) - a] e^{-F(s)}

is a constant, and setting s = 0, we have

    [z(s) - a] e^{-F(s)} = z(0) - a.

Hence,

    e^{F(s)} = \frac{z(s) - a}{z(0) - a}    and    e^{F(1)} = \frac{z(1) - a}{z(0) - a} = 1,

since C is closed, i.e., z(1) = z(0). Thus

    F(1) = 2\pi k i    for some integer k

and

    n(C, a) = \frac{1}{2\pi i} F(1) = k.

In terms of the winding number, the residue theorem given in Sect. 9.1.1
can be restated as follows:


Residue theorem (restated):

Suppose f(z) is analytic in a simply connected domain D except for
isolated singularities at z_1, z_2, ..., z_m. Let C be a closed curve that does
not intersect any of the singularities. Then

    \oint_C f(z) dz = 2\pi i \sum_{k=1}^{m} n(C, z_k) Res(f, z_k).    (9.12)

The proof is left to the reader.
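
The defining integral of the winding number can be approximated for any parametrized closed curve. The following sketch is an illustration under stated assumptions: the helper names are hypothetical, and the integral of dz/(z - a) is accumulated as a sum of increments of log(z - a) over small steps, which is accurate as long as the curve is sampled finely enough.

```python
import cmath


def winding_number(curve, a, steps=4000):
    """Approximate n(C, a) = (1 / (2 pi i)) * integral over C of dz / (z - a).

    `curve` maps t in [0, 1] to a point of a closed curve with curve(0) == curve(1).
    """
    total = 0j
    previous = curve(0.0) - a
    for k in range(1, steps + 1):
        current = curve(k / steps) - a
        total += cmath.log(current / previous)   # increment of log(z - a)
        previous = current
    return (total / (2j * cmath.pi)).real


def circle(times):
    """Unit circle traversed `times` times (negative means clockwise)."""
    return lambda t: cmath.exp(2j * cmath.pi * times * t)


if __name__ == "__main__":
    print(winding_number(circle(1), 0.0))    # point inside, one turn
    print(winding_number(circle(3), 0.2j))   # point inside, three turns
    print(winding_number(circle(1), 3.0))    # point outside
```

Rounding the returned value to the nearest integer recovers k, in agreement with the theorem that n(C, a) is always an integer.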


9.1.4 Ratio Method 

We saw in Sect. 7.4.5 that a function having a pole of order m can be expressed
by the ratio of two polynomials such as

    f(z) = \frac{p(z)}{q(z)}.    (9.13)

In this case, it is possible to formulate an alternative equation that determines
the residue of f(z). Employing such an equation to evaluate the residue is
referred to as a ratio method.

To derive these equations, we first recall the fact that if a function p(z)
satisfies

    p(a) = p'(a) = \cdots = p^{(m-1)}(a) = 0    and    p^{(m)}(a) ≠ 0,

the Taylor series for p(z) is given by

    p(z) = \frac{p^{(m)}(a)}{m!} (z - a)^m + h.o.,

where h.o. means the terms of higher order. Such a function, for which the
lowest power of (z - a) is m, is said to have an mth-order zero at a.

Now we present the equation for the residue of f(z) at a simple pole a.
As seen from (9.13), a simple pole of f(z) arises from the fact that p(z) has a
zero of (m - 1)th order and q(z) has a zero of order m. Then,

    f(z) = \frac{ \frac{p^{(m-1)}(a)}{(m-1)!} (z - a)^{m-1} + h.o. }
                { \frac{q^{(m)}(a)}{m!} (z - a)^m + h.o. }.

For such a function, we obtain the residue of f at the simple pole a as

    c_{-1} = \lim_{z \to a} (z - a) f(z) = m \frac{p^{(m-1)}(a)}{q^{(m)}(a)}.    (9.14)

By means of (9.14), we can compute the residue of f(z) at a simple pole a quite
easily.

Next we consider the equation for a second-order pole of f(z) at a. Such
a pole arises when p(z) has a zero of order m and q(z) has a zero of order
(m + 2) at a. Then,

    f(z) = \frac{ \frac{p^{(m)}(a)}{m!} (z - a)^m + \frac{p^{(m+1)}(a)}{(m+1)!} (z - a)^{m+1} + h.o. }
                { \frac{q^{(m+2)}(a)}{(m+2)!} (z - a)^{m+2} + \frac{q^{(m+3)}(a)}{(m+3)!} (z - a)^{m+3} + h.o. },

from which we get

    c_{-1} = \frac{m+2}{m+3} \cdot
             \frac{ (m+3) p^{(m+1)}(a) q^{(m+2)}(a) - (m+1) p^{(m)}(a) q^{(m+3)}(a) }
                  { [q^{(m+2)}(a)]^2 }.    (9.15)

For example, if the second-order pole of a function arises from a second-order
zero of q(z), then m = 0. The residue of such a pole is given by (9.15) as

    c_{-1} = \frac{2}{3} \cdot \frac{ 3 p'(a) q''(a) - p(a) q'''(a) }{ [q''(a)]^2 }.    (9.16)


9.1.5 Evaluating the Residues 

In what follows, we demonstrate actual procedures to evaluate the residue 
by means of the three methods discussed in the previous subsections. As an 
instructive example, we consider the function 

    f(z) = \frac{e^z}{z (z + 2)^2},

which has a simple pole at z = 0 and a second-order pole at z = -2.


Using a Laurent expansion:

The present purpose is to evaluate the coefficient c_{-1} of the Laurent series
expansion of f(z) around the poles at z = 0 and z = -2. In order to do this
we first determine the Taylor series for the factor e^z/(z + 2)^2 around z = 0.
Since the expressions

    e^z = 1 + z + \frac{z^2}{2!} + \cdots

and

    \frac{1}{(z + 2)^2} = \frac{1}{4} \left[ \frac{1}{1 + (z/2)} \right]^2
                        = \frac{1}{4} \left( 1 - z + \frac{3}{4} z^2 - \cdots \right)

hold around z = 0, we have

    \frac{e^z}{z (z + 2)^2}
        = \frac{1}{4z} \left( 1 + z + \frac{z^2}{2!} + \cdots \right)
                       \left( 1 - z + \frac{3}{4} z^2 - \cdots \right)
        = \frac{1}{4z} + \frac{z}{16} + \cdots.

Thus, we immediately obtain

    c_{-1}(0) = \frac{1}{4}.

Similarly we have

    c_{-1}(-2) = -\frac{3}{4} e^{-2}    (see Exercise 1).


Using Goursat's formula:

The residue of the simple pole at z = 0 is given by

    c_{-1}(0) = \lim_{z \to 0} \left[ z \cdot \frac{e^z}{z (z + 2)^2} \right] = \frac{1}{4},

and that of the second-order pole at z = -2 is given by

    c_{-1}(-2) = \frac{1}{1!} \lim_{z \to -2} \frac{d}{dz} \left[ (z + 2)^2 \frac{e^z}{z (z + 2)^2} \right]
               = -\frac{3}{4} e^{-2}.


Using the ratio method:

For this example, the numerator and denominator functions can be chosen in
different ways. For the residue at z = 0, we could take

    p(z) = e^z,    q(z) = z (z + 2)^2

or, alternatively,

    p(z) = \frac{e^z}{(z + 2)^2},    q(z) = z.

For either choice, the residue for the simple pole is given by

    c_{-1}(0) = \frac{p(0)}{q'(0)} = \frac{1}{4}.

The residue c_{-1}(-2) can be obtained in a similar manner as above (see
Exercise 2).
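
As a cross-check of the three methods, the residues of this example can also be obtained by direct numerical integration of (9.3) on a small circle and compared with the ratio formula (9.16). The code is a sketch; the function names, the radius 0.5, and the step count are assumptions made for the illustration.

```python
import cmath
import math


def f(z):
    """The example function e^z / (z (z + 2)^2)."""
    return cmath.exp(z) / (z * (z + 2) ** 2)


def residue_by_contour(func, a, rho=0.5, steps=4000):
    """c_{-1} = (1 / (2 pi i)) * integral of func over the circle |z - a| = rho."""
    total = 0j
    dphi = 2 * cmath.pi / steps
    for k in range(steps):
        phase = cmath.exp(1j * k * dphi)
        total += func(a + rho * phase) * 1j * rho * phase * dphi
    return total / (2j * cmath.pi)


def residue_ratio_double_pole(p, dp, d2q, d3q):
    """Formula (9.16): (2/3) (3 p' q'' - p q''') / (q'')^2, values taken at the pole."""
    return (2.0 / 3.0) * (3 * dp * d2q - p * d3q) / d2q**2


if __name__ == "__main__":
    print(residue_by_contour(f, 0.0))     # compare with 1/4
    print(residue_by_contour(f, -2.0))    # compare with -(3/4) e^{-2}
    # p = e^z and q = z (z + 2)^2 at z = -2: p = p' = e^{-2}, q'' = -4, q''' = 6
    e2 = math.exp(-2.0)
    print(residue_ratio_double_pole(e2, e2, -4.0, 6.0))
```

The radius 0.5 keeps each circle clear of the other pole, so each contour integral isolates a single residue.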


Exercises 

1. Evaluate the residue of

    f(z) = \frac{e^z}{z (z + 2)^2}

   at z = -2 by using a Laurent expansion.

Solution: The residue of f(z) at z = -2 is found by using the
expression

    e^z = e^{-2} e^{z+2} = e^{-2} \sum_{n=0}^{\infty} \frac{(z + 2)^n}{n!}
        = e^{-2} \left[ 1 + (z + 2) + \frac{(z + 2)^2}{2!} + \cdots \right]

and the Taylor series expansion for 1/z around z = -2 as

    \frac{1}{z} = -\frac{1}{2} \left[ \frac{1}{1 - (z + 2)/2} \right]
                = -\sum_{m=0}^{\infty} \frac{(z + 2)^m}{2^{m+1}}
                = -\frac{1}{2} - \frac{z + 2}{4} - \frac{(z + 2)^2}{8} - \cdots.

Thus, the Laurent series for f(z) around z = -2 is

    \frac{e^z}{z (z + 2)^2} = -\frac{1}{2} e^{-2}
        \left[ \frac{1}{(z + 2)^2} + \frac{3}{2} \frac{1}{z + 2} + \frac{5}{4} + \cdots \right],

from which we have

    c_{-1}(-2) = -\frac{3}{4} e^{-2}.


2. Evaluate the residue of

    f(z) = \frac{e^z}{z (z + 2)^2}

   at z = -2 using the ratio method.

Solution: For the pole at z = -2, we can choose either

    p(z) = e^z,    q(z) = z (z + 2)^2

as before or

    p(z) = \frac{e^z}{z},    q(z) = (z + 2)^2.

Then, regardless of how the numerator and denominator are chosen,
we refer to (9.16) to obtain

    c_{-1}(-2) = \frac{2}{3} \cdot \frac{ 3 p'(-2) q''(-2) - p(-2) q'''(-2) }{ [q''(-2)]^2 }
               = -\frac{3}{4} e^{-2}.


9.2 Applications to Real Integrals 

9.2.1 Classification of Evaluable Real Integrals 

Using the residue theorem, we can evaluate the five types of real integrals 
listed below. 

1. \int_0^{2\pi} f(\cos\theta, \sin\theta) d\theta, where f(x, y) is a rational
   function without a pole on the circle x^2 + y^2 = 1.

2. \int_{-\infty}^{\infty} f(x) dx, where f(z) is a rational function without a
   real pole and is subject to the condition that \lim_{|x| \to \infty} x f(x) = 0.

3. \int_{-\infty}^{\infty} f(x) e^{ix} dx, where f(z) is an analytic function in
   the upper half-plane Im z ≥ 0 except at a finite number of points.

4. \int_0^{\infty} f(x)/x^{\alpha} dx, where α denotes a real number such that
   0 < α < 1 and f(z) is a rational function with no pole on the positive real
   axis x ≥ 0, which satisfies the condition f(z)/z^{\alpha - 1} → 0 as z → 0
   and z → ∞.

5. \int_0^{\infty} f(x) \log x \, dx, where f(z) is a rational function with no
   pole on the positive real axis x ≥ 0 and satisfies the condition
   \lim_{x \to +\infty} x f(x) = 0.

In Sects. 9.2.2-9.2.6 we demonstrate actual processes for evaluating the
above integrals.


9.2.2 Type 1: Integrals of f(cos θ, sin θ)


Consider an integral of the form

    \int_0^{2\pi} f(\cos\theta, \sin\theta) d\theta.

Setting z = e^{iθ} makes it a contour integral around the unit circle, and thus
the evaluation of the residues within the circle completes the integration.

Example We evaluate the integral

    I = \int_0^{2\pi} \frac{d\theta}{1 - 2p\cos\theta + p^2}    (|p| < 1).    (9.17)

If we express cos θ in terms of z = e^{iθ}, (9.17) becomes a contour integral,

    I = \oint_C \frac{1}{1 - p(z + z^{-1}) + p^2} \frac{dz}{iz}
      = \frac{1}{i} \oint_C \frac{dz}{(1 - pz)(z - p)},    (9.18)

where C is a unit circle centered at the origin. The integrand in (9.18) has a
simple (first-order) pole at z = p within C. Hence, we obtain

    I = \frac{1}{i} \times 2\pi i \lim_{z \to p} \left( \frac{1}{1 - pz} \right) = \frac{2\pi}{1 - p^2}.
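
The closed form 2π/(1 - p^2) can be compared with a direct quadrature of (9.17). This is a minimal sketch; the function name and the step count are assumptions. For a smooth periodic integrand the equally spaced rule used here converges very quickly.

```python
import math


def type1_integral(p, steps=20000):
    """Equally spaced rule for the integral of 1 / (1 - 2 p cos(t) + p^2) over [0, 2 pi]."""
    h = 2 * math.pi / steps
    return sum(h / (1 - 2 * p * math.cos(k * h) + p * p) for k in range(steps))


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.9):
        print(p, type1_integral(p), 2 * math.pi / (1 - p * p))
```

The agreement deteriorates only as p approaches 1, where the pole z = p moves toward the contour.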


9.2.3 Type 2: Integrals of Rational Function


Next consider the integral

    \int_{-\infty}^{\infty} f(x) dx,    (9.19)

where f(x) is a rational function subject to the condition

    \lim_{|x| \to \infty} x f(x) = 0,

which is a necessary and sufficient condition for the integral to be convergent.
To evaluate (9.19), we consider the integral of f(z) along a closed contour
consisting of the real axis from -R to +R and a semicircle Γ(R) in the upper
half-plane. The contour integral is expressed as

    \oint_C f(z) dz = \int_{-R}^{R} f(x) dx + \int_{\Gamma(R)} f(z) dz.    (9.20)

From the Lemma below, it follows that the second term in (9.20) vanishes in
the limit R → ∞. Hence, we obtain

    \lim_{R \to \infty} \oint_C f(z) dz = \int_{-\infty}^{\infty} f(x) dx    (9.21)

and applying the residue theorem yields

    \int_{-\infty}^{\infty} f(x) dx = 2\pi i \sum_j Res(f, a_j),

where a_j is the jth pole of f(z) in the upper half-plane. Therefore, the
evaluation of the residues located within the upper half-plane completes the
integration.


Example We prove the equation

    I = \int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi.

Since x/(1 + x^2) vanishes as |x| → ∞, we may follow a process similar to the
one discussed above. Since z = i is the only pole of
1/(1 + z^2) = 1/[(z + i)(z - i)] in the upper half-plane, we have

    I = (2\pi i) \cdot Res(i) = 2\pi i \cdot \frac{1}{2i} = \pi.

Less simple examples will be found in Exercises Sect. 9.2.

As was noted earlier, our result (9.21) is based on the following lemma:


Lemma:

Let f(z) be continuous in the sector θ_1 ≤ arg z ≤ θ_2. If

    \lim_{|z| \to \infty} z f(z) = 0    for θ_1 ≤ arg z ≤ θ_2,    (9.22)

then the integral \int f(z) dz extended over the arc of the circle |z| = r
contained in the sector tends to 0 as r → ∞.


Proof Let M(r) be the upper bound of |f(z)| on the arc of the circle |z| = r.
Then we have

    \left| \int f(z) dz \right| ≤ M(r) \int_{\theta_1}^{\theta_2} r \, d\theta = M(r) \cdot r (\theta_2 - \theta_1).    (9.23)

In view of the condition (9.22), the right-hand side of (9.23) vanishes as
r → ∞. This completes the proof.

9.2.4 Type 3: Integrals of f(x)e^{ix}


We now study integrals of the form

    \int_{-\infty}^{\infty} f(x) e^{ix} dx,

where f is analytic on the upper half-plane Im z ≥ 0 except at a finite number
of singularities (if they exist). We first consider the case when the
singularities are not on the real axis. Then, the integral

    \int_{-R}^{R} f(x) e^{ix} dx

has a meaning, and its behavior as R → ∞ can be seen from the following
theorem:


Theorem:

If \lim_{|z| \to \infty} f(z) = 0 for Im z ≥ 0, then

    \lim_{R \to \infty} \int_{-R}^{R} f(x) e^{ix} dx = 2\pi i \sum Res \left[ f(z) e^{iz} \right],

the summation extending over the singularities of f(z) contained in the
upper half-plane y > 0.


Before starting the proof, we note that |e^{iz}| ≤ 1 in the half-plane y ≥ 0.
This leads us to integrate on the half-plane y ≥ 0 along the contour used
above for an integral of type 2. To prove the theorem, it thus suffices to show
that the integral \int_{\Gamma(r)} f(z) e^{iz} dz tends to 0 as r tends to ∞.

If we know in advance that \lim_{|z| \to \infty} z f(z) = 0, then it would be
sufficient to apply the lemma in Sect. 9.2.3. To prove that
\int_{\Gamma(r)} f(z) e^{iz} dz tends to 0 with only the hypothesis of the
theorem above, we use the following lemma:


Jordan Lemma:

Let f(z) be a function defined in a sector of the half-plane y ≥ 0. If
\lim_{|z| \to \infty} f(z) = 0, the integral \int f(z) e^{iz} dz extended over
the arc of the circle |z| = r contained in the sector tends to 0 as r tends
to ∞.


Proof Let us put z = r e^{iθ} and let M(r) be the upper bound of |f(r e^{iθ})|
as θ varies, the point e^{iθ} remaining in the sector. Then,

    \left| \int f(z) e^{iz} dz \right| ≤ M(r) \int_0^{\pi} e^{-r \sin\theta} r \, d\theta
        = 2 M(r) \int_0^{\pi/2} e^{-r \sin\theta} r \, d\theta.    (9.24)

Since

    \frac{2}{\pi} ≤ \frac{\sin\theta}{\theta} ≤ 1    for 0 ≤ θ ≤ \frac{\pi}{2},

we have

    \int_0^{\pi/2} e^{-r \sin\theta} r \, d\theta
        ≤ \int_0^{\pi/2} e^{-(2/\pi) r \theta} r \, d\theta
        ≤ \int_0^{\infty} e^{-(2/\pi) r \theta} r \, d\theta = \frac{\pi}{2}.    (9.25)

From (9.24) and (9.25), it follows that

    \left| \int f(z) e^{iz} dz \right| ≤ \pi M(r).    (9.26)

In view of our assumption \lim_{|z| \to \infty} f(z) = 0, the right-hand side of
(9.26) vanishes as r → ∞, which completes the proof.


Remark.

1. If we have to calculate an integral

       \int_{-\infty}^{\infty} f(x) e^{-ix} dx

   that involves a negative imaginary exponential e^{-ix}, it would be necessary
   to integrate in the lower half-plane instead of the upper one because the
   function |e^{-iz}| is bounded in the lower half-plane y ≤ 0. More generally,
   an integral of the form \int_{-\infty}^{\infty} f(x) e^{ax} dx (where a is a
   complex constant) can be evaluated by integrating in the half-plane where
   |e^{az}| ≤ 1.

2. Remember that sin z and cos z are not bounded in any half-plane. To
   evaluate integrals of the form

       \int_{-\infty}^{\infty} f(x) \sin^n x \, dx    and    \int_{-\infty}^{\infty} f(x) \cos^n x \, dx,

   we always express the trigonometric functions in terms of complex
   exponentials so that the preceding methods can be applied.
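
To see the theorem at work, one may take the sample function f(z) = 1/(1 + z^2), which is an assumption of this illustration and does not appear in the text above. Its only singularity in the upper half-plane is z = i, where Res[f(z)e^{iz}] = e^{-1}/(2i), so the theorem predicts the value π/e. The sketch below compares this with a trapezoidal approximation of the truncated integral; the cutoff R and the step h are arbitrary choices.

```python
import cmath
import math


def truncated_integral(R=200.0, h=0.01):
    """Trapezoid rule for the integral of e^{ix} / (1 + x^2) over [-R, R]."""
    n = int(round(2 * R / h))
    total = 0j
    for k in range(n + 1):
        x = -R + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(1j * x) / (1 + x * x)
    return total * h


if __name__ == "__main__":
    value = truncated_integral()
    print(value.real, value.imag, math.pi / math.e)
```

The imaginary part vanishes by symmetry, and the real part approaches π/e as R grows; the remaining discrepancy comes from the neglected tails, which are of order 1/R^2.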


9.2.5 Type 4: Integrals of f(x)/x^α

Consider integrals of the form

    \int_0^{\infty} \frac{f(x)}{x^{\alpha}} dx,

where α denotes a real number such that 0 < α < 1, and f(x) is a rational
function with no pole on the positive real axis x ≥ 0. In addition, we assume
that f(z)/z^{α-1} → 0 in the limits z → 0 and z → ∞.

To calculate such an integral, we consider the function

    g(z) = \frac{f(z)}{z^{\alpha}}

of the complex variable z, defined in the plane with the positive real axis
x ≥ 0 excluded. Let D be the open set thus defined. It is necessary to specify
the branch of z^α chosen in D, so we take the branch of the argument of z
between 0 and 2π. With this convention, we integrate g(z) along the closed
path C(r, ε) as follows: we first trace the real axis from ε > 0 to r > 0, then
the circle Γ(r) centered at the origin with radius r in the positive sense,
then the real axis from r to ε, and finally, the circle γ(ε) of center 0 and
radius ε in the negative sense. The integral

    \int_{C(r, \varepsilon)} \frac{f(z)}{z^{\alpha}} dz

is equal to 2πi times the sum of the residues of the poles of f(z)/z^α
contained in D if r has been chosen sufficiently large and ε sufficiently
small. We have

    \int_{C(r, \varepsilon)} \frac{f(z)}{z^{\alpha}} dz
        = \int_{\Gamma(r)} \frac{f(z)}{z^{\alpha}} dz - \int_{\gamma(\varepsilon)} \frac{f(z)}{z^{\alpha}} dz
          + \left( 1 - e^{-2\pi i \alpha} \right) \int_{\varepsilon}^{r} \frac{f(x)}{x^{\alpha}} dx

because when the argument of z is equal to 2π,

    z^{\alpha} = e^{2\pi i \alpha} |z|^{\alpha}.

By assumption, f(z)/z^{α-1} tends to 0 when z tends to 0 or when |z| tends
to infinity. Thus the integrals along Γ(r) and γ(ε) tend to 0 as r → ∞ and
ε → 0. In the limit, we have

    \left( 1 - e^{-2\pi i \alpha} \right) \int_0^{\infty} \frac{f(x)}{x^{\alpha}} dx
        = 2\pi i \sum Res \left[ \frac{f(z)}{z^{\alpha}} \right].    (9.27)

This relation allows us to calculate the original integral.

Example Try to evaluate the integral

    I = \int_0^{\infty} \frac{dx}{x^{\alpha} (1 + x)}    (0 < α < 1).

Here we have

    f(z) = \frac{1}{1 + z},

where there is only one pole at z = -1. As the branch of the argument of z is
equal to π at this point, the residue of f(z)/z^α at this pole is equal to
1/e^{πiα}. Relation (9.27) then gives

    I = \frac{\pi}{\sin \pi\alpha}.
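
The result π/sin(πα) can be checked by quadrature. The sketch below is an illustration under stated assumptions: the substitution x = e^u, the cutoff L, and the step h are choices made here, not part of the text. The substitution turns the integrand into e^{(1-α)u}/(1 + e^u), which decays exponentially in both directions and is therefore well suited to the trapezoid rule.

```python
import math


def type4_integral(alpha, L=80.0, h=0.01):
    """Trapezoid rule for the integral of x**(-alpha) / (1 + x) over x > 0, using x = exp(u)."""
    n = int(round(2 * L / h))
    total = 0.0
    for k in range(n + 1):
        u = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp((1 - alpha) * u) / (1 + math.exp(u))
    return total * h


if __name__ == "__main__":
    for alpha in (0.3, 0.5, 0.7):
        print(alpha, type4_integral(alpha), math.pi / math.sin(math.pi * alpha))
```

For α close to 0 or 1 one of the two tails decays slowly, so a larger cutoff L would be needed for the same accuracy.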

9.2.6 Type 5: Integrals of f(x) log x

The final type of integral to be noted is a class of the form

    \int_0^{\infty} f(x) \log x \, dx,

where f is a rational function with no pole on the positive real axis x ≥ 0 and

    \lim_{x \to \infty} x f(x) = 0.

This last condition ensures that the integral is convergent.

We consider the same open set D as for integrals of Type 4 and the same
path of integration. Here again, we must specify the branch chosen for log z,
and we choose the argument of z between 0 and 2π. For a reason that will
soon be apparent, we integrate the function f(z)(log z)^2 instead of f(z) log z.
Here again the integrals along the circles Γ(r) and γ(ε) tend to 0 as r → ∞
and ε → 0, respectively.

When the argument of z is equal to 2π, we have

    \log z = \log x + 2\pi i.

Thus we have the relation

    \int_0^{\infty} f(x) (\log x)^2 dx - \int_0^{\infty} f(x) (\log x + 2\pi i)^2 dx
        = 2\pi i \sum Res \left[ f(z) (\log z)^2 \right],

and, hence,

    -2 \int_0^{\infty} f(x) \log x \, dx - 2\pi i \int_0^{\infty} f(x) dx
        = \sum Res \left[ f(z) (\log z)^2 \right].    (9.28)

When f is real on the real axis, both integrals in (9.28) are real, so taking
the real part of the relation (9.28) gives the desired result:

    \int_0^{\infty} f(x) \log x \, dx = -\frac{1}{2} Re \left\{ \sum Res \left[ f(z) (\log z)^2 \right] \right\}.


Example Consider the integral

    I = \int_0^{\infty} \frac{\log x}{(1 + x)^3} dx.

As the residue of (log z)^2/(1 + z)^3 at the pole z = -1 is equal to 1 - iπ, we
find

    I = -\frac{1}{2}.
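
A numerical evaluation of this example helps to confirm which part of (9.28) carries the logarithmic integral. The sketch is illustrative; the substitution x = e^u, the cutoff, and the step size are assumptions of this example. With the residue 1 - iπ, relation (9.28) predicts -1/2 for the integral with log x and 1/2 for the integral of 1/(1 + x)^3.

```python
import math


def type5_integral(L=60.0, h=0.01):
    """Trapezoid rule for the integral of log(x) / (1 + x)^3 over x > 0, using x = exp(u)."""
    n = int(round(2 * L / h))
    total = 0.0
    for k in range(n + 1):
        u = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0
        e = math.exp(u)
        total += weight * u * e / (1 + e) ** 3    # log x = u and dx = e^u du
    return total * h


if __name__ == "__main__":
    residue = complex(1.0, -math.pi)              # residue of (log z)^2 / (1 + z)^3 at z = -1
    print(type5_integral(), -0.5 * residue.real)  # integral of f(x) log x
    print(-residue.imag / (2 * math.pi))          # integral of f(x), expected 1/2
```

The quadrature agrees with the real part of the residue relation, while the imaginary part reproduces the elementary integral of 1/(1 + x)^3.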


Exercises


1. Evaluate the integral defined by

    I = \int_0^{2\pi} \frac{d\theta}{(1 + a \cos\theta)^2}    (0 < a < 1).

Solution: Let z = e^{iθ} and set C: |z| = 1. Then

    I = \frac{4}{i a^2} \oint_C \frac{z \, dz}{[z^2 + (2z/a) + 1]^2}.

The integrand has two poles of second order at z = z_1, z_2 (|z_1| < |z_2|),
which are the solutions of the equation g(z) = z^2 + (2z/a) + 1 = 0.
Since 0 < a < 1, only the pole z_1 = (-1 + \sqrt{1 - a^2})/a is
found within C. The residue at z_1 is given by

    Res(z_1) = \lim_{z \to z_1} \frac{d}{dz} \left[ (z - z_1)^2 \frac{z}{g(z)^2} \right]
             = \lim_{z \to z_1} \frac{d}{dz} \left[ \frac{z}{(z - z_2)^2} \right]
             = -\frac{z_1 + z_2}{(z_1 - z_2)^3}
             = \frac{2/a}{(2\sqrt{1 - a^2}/a)^3},

and thus we obtain

    I = \frac{4}{i a^2} \times 2\pi i \, Res(z_1) = \frac{2\pi}{(1 - a^2)^{3/2}}.


2. Evaluate the integral

    I = \frac{1}{2\pi i} \oint_C \frac{e^z}{z^n} dz    (C: |z| = 1)

   for integer n.

Solution: For integers n ≤ 0, it is apparent that I = 0 since
the integrand is analytic within and on C. For integers n > 0,
f(z) = e^z z^{-n} has a pole of order n at z = 0. Using the residue
theorem, we have I = 1/(n - 1)!.

3. Calculate the integral

    I = \int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^{n+1}}.

Solution: Define the function f(z) = 1/(1 + z^2)^{n+1}, and set the
semicircle C as shown in Fig. 9.1. Within C, f(z) has a pole of
(n + 1)th order at z = i, and its residue reads

    \frac{1}{n!} \lim_{z \to i} \frac{d^n}{dz^n} \left[ \frac{(z - i)^{n+1}}{(1 + z^2)^{n+1}} \right]
        = \frac{1}{n!} \lim_{z \to i} \frac{d^n}{dz^n} (z + i)^{-(n+1)}
        = \frac{(-1)^n (n+1)(n+2) \cdots 2n}{n!} (2i)^{-(2n+1)}
        = \frac{(2n)!}{2^{2n} (n!)^2} \cdot \frac{1}{2i}.

Hence, in view of the residue theorem, we have

    \oint_C f(z) dz = \frac{\pi (2n)!}{2^{2n} (n!)^2}.    (9.29)

We now observe that

    \oint_C f(z) dz = \int_{-R}^{R} \frac{dx}{(1 + x^2)^{n+1}} + \int_{\Gamma} \frac{dz}{(1 + z^2)^{n+1}},    (9.30)

where Γ denotes the upper half-circle. Since |1 + z^2| ≥ R^2 - 1 on
Γ, the second integral in the limit R → ∞ yields

    \left| \int_{\Gamma} \frac{dz}{(1 + z^2)^{n+1}} \right| ≤ \frac{\pi R}{(R^2 - 1)^{n+1}} \to 0.    (9.31)

From (9.29)-(9.31), we conclude that

    I = \frac{\pi (2n)!}{2^{2n} (n!)^2}.



4. Calculate the integral

    I = \int_0^{2\pi} \log(1 - 2r \cos\theta + r^2) d\theta,    where r ≠ 1.

Solution: First we assume that r < 1. Observe that the function
\log(1 - z)/z = -1 - (z/2) - (z^2/3) - \cdots is analytic for |z| ≤ r < 1.
Hence, if we set the circle C: |z| = r, we have

    \oint_C \frac{\log(1 - z)}{z} dz = i \int_0^{2\pi} \log(1 - z) d\theta = 0.    (9.32)

Since |1 - z|^2 = 1 - 2r \cos\theta + r^2 on C, the real part of
\log(1 - z) equals (1/2) \log(1 - 2r \cos\theta + r^2), and the real part of
the second integral in (9.32) gives
(1/2) \int_0^{2\pi} \log(1 - 2r \cos\theta + r^2) d\theta = 0, so we get

    I = 0    for r < 1.

Next we consider the case of r > 1. Set s = 1/r < 1 to obtain

    0 = \int_0^{2\pi} \log(1 - 2s \cos\theta + s^2) d\theta
      = \int_0^{2\pi} \log \left( 1 - \frac{2}{r} \cos\theta + \frac{1}{r^2} \right) d\theta
      = \int_0^{2\pi} \left[ \log(1 - 2r \cos\theta + r^2) - \log r^2 \right] d\theta.

Hence, we conclude that

    I = 2\pi \log r^2 = 4\pi \log r    for r > 1.


5. Calculate the integral

    I = \int_0^{\infty} \frac{x^{\alpha - 1}}{1 + x} dx,    where 0 < α < 1.

Solution: Consider the power function

    z^{\beta} = e^{\beta \log z} = e^{\beta (\log|z| + i \arg z)}

with -1 < β < 0. Its branch for 0 < arg z < 2π is single-valued on
the domain D enclosed by the contour C = AB + Γ + B'A' + γ
depicted in Fig. 9.2. Let the radius r of the circle γ be sufficiently
small and that R of Γ be sufficiently large. Then, the pole z = -1
of the function f(z) = z^β/(1 + z) is located within C so that we
have

    \oint_C f(z) dz = \int_r^R \frac{x^{\beta}}{1 + x} dx + \int_{\Gamma} f(z) dz
                      - \int_r^R \frac{x^{\beta} e^{\beta \cdot 2\pi i}}{1 + x} dx + \int_{\gamma} f(z) dz.    (9.33)

Observe that

    \left| \int_{\Gamma} f(z) dz \right| ≤ \int_{\Gamma} \frac{|z^{\beta}|}{|1 + z|} |dz|
        ≤ \frac{2\pi R^{\beta + 1}}{R - 1} \to 0    (R → ∞)

and

    \left| \int_{\gamma} f(z) dz \right| ≤ \frac{2\pi r^{\beta + 1}}{1 - r} \to 0    (r → 0).

Take the limits R → ∞ and r → 0 on both sides of (9.33) to yield

    \left( 1 - e^{\beta \cdot 2\pi i} \right) \int_0^{\infty} \frac{x^{\beta}}{1 + x} dx
        = 2\pi i \, Res \left[ \frac{z^{\beta}}{1 + z}, -1 \right]
        = 2\pi i \lim_{z \to -1} z^{\beta} = 2\pi i e^{\beta \pi i},

which then gives us

    \int_0^{\infty} \frac{x^{\beta}}{1 + x} dx = 2\pi i \frac{e^{\beta \pi i}}{1 - e^{\beta \cdot 2\pi i}}.

Since β = α - 1, the above result is equivalent to

    I = \int_0^{\infty} \frac{x^{\alpha - 1}}{1 + x} dx = \frac{\pi}{\sin \alpha\pi}.

Fig. 9.2. Integration path C = AB + Γ + B'A' + γ used in Exercise 5


9.3 More Applications of Residue Calculus 


9.3.1 Integrals on Rectangular Contours 


The integrals discussed so far are evaluated using the residue theorem based 
on a circular (or semicircular) contour whose radius is eventually made to be 
infinitely large or infinitely small. However, there are other integrals that can 
be evaluated by the residue theorem that do not have to be closed with a 
circle. Several examples are given below. 

Let us consider the integral

\[ I = \int_{-\infty}^{\infty} \frac{x e^x}{(1+e^{2x})^2}\,dx. \]

To evaluate it, we examine the contour integral

\[ J = \oint_C \frac{z e^z}{(1+e^{2z})^2}\,dz \tag{9.34} \]


around the rectangular contour shown in Fig. 9.3. Beginning at the lower
left-hand corner of the rectangle,

\[
J = \int_{-L}^{L} \frac{x e^x}{(1+e^{2x})^2}\,dx
  + \int_0^{\pi} \frac{(L+iy)e^{L+iy}}{(1+e^{2(L+iy)})^2}\,i\,dy
  + \int_{L}^{-L} \frac{(x+i\pi)e^{x+i\pi}}{(1+e^{2(x+i\pi)})^2}\,dx
  + \int_{\pi}^{0} \frac{(-L+iy)e^{-L+iy}}{(1+e^{2(-L+iy)})^2}\,i\,dy.
\tag{9.35}
\]
In the limit $L \to \infty$, the second and fourth integrals of (9.35) go to zero, since
in this limit the magnitudes of $e^{2(L+iy)}$ and $e^{2(-L+iy)}$ become very large and
very small, respectively, compared to unity. Hence, we have

\[
\lim_{L\to\infty} J = \int_{-\infty}^{\infty} \frac{x e^x}{(1+e^{2x})^2}\,dx + \int_{\infty}^{-\infty} \frac{(x+i\pi)e^{x+i\pi}}{(1+e^{2(x+i\pi)})^2}\,dx = 2I + i\pi\int_{-\infty}^{\infty} \frac{e^x}{(1+e^{2x})^2}\,dx, \tag{9.36}
\]

where we have used the expressions $e^{x+i\pi} = -e^x$ and $e^{2(x+i\pi)} = e^{2x}$. As a
result, the integral $I$ to be evaluated is expressed in terms of $J$ as

\[ I = \frac{1}{2}\lim_{L\to\infty} J - \frac{i\pi}{2}\int_{-\infty}^{\infty} \frac{e^x}{(1+e^{2x})^2}\,dx. \tag{9.37} \]

The contour integral $J$ is readily evaluated by employing the residue theorem.
Looking back to the definition (9.34), we see that the integrand of $J$ has second-order
poles at the values of $z$ for which $e^{2z} = -1$. These values are

\[ z = \pm\frac{i\pi}{2}, \pm\frac{3i\pi}{2}, \cdots, \pm i\left(N + \frac{1}{2}\right)\pi, \]

where $N$ is a nonnegative integer. Note that only the pole at $z = i\pi/2$ is
enclosed in the rectangle (see Fig. 9.3). Hence, using the ratio method (see
Sect. 9.1.4), in which the term containing $q'''$ must be kept because
$p(i\pi/2) \ne 0$, we have

\[ J = 2\pi i\,\mathrm{Res}\left[\frac{p(z)}{q(z)}, \frac{i\pi}{2}\right] = 2\pi i\left[\frac{2p'}{q''} - \frac{2}{3}\frac{p\,q'''}{(q'')^2}\right]_{z=i\pi/2} = 2\pi i\cdot\frac{\pi + 2i}{8} = -\frac{\pi}{4}(2 - i\pi), \tag{9.38} \]

where $p(z) = ze^z$ and $q(z) = (1+e^{2z})^2$ are constituents of the integrand in
(9.34), with $p(i\pi/2) = -\pi/2$, $p'(i\pi/2) = i - \pi/2$, $q''(i\pi/2) = 8$, and
$q'''(i\pi/2) = 48$.

The latter integral of (9.37) is evaluated by substituting $w = e^x$, and it
follows that

\[ \int_{-\infty}^{\infty} \frac{e^x}{(1+e^{2x})^2}\,dx = \int_0^\infty \frac{dw}{(1+w^2)^2} = \frac{1}{2}\int_{-\infty}^{\infty} \frac{dw}{(1+w^2)^2}. \tag{9.39} \]

Thus, applying the residue theorem yields

\[ \frac{1}{2}\int_{-\infty}^{\infty} \frac{dw}{(1+w^2)^2} = \frac{1}{2}\cdot 2\pi i\cdot\mathrm{Res}\left[\frac{1}{(1+w^2)^2}, i\right] = \pi i\lim_{w\to i}\frac{d}{dw}\frac{1}{(w+i)^2} = \frac{\pi}{4}. \tag{9.40} \]

From (9.37), (9.38), and (9.40), we finally obtain

\[ I = -\frac{\pi}{8}(2 - i\pi) - \frac{i\pi^2}{8} = -\frac{\pi}{4}. \]
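
As a sanity check on the rectangular-contour evaluation, the real integral $I$ can be integrated directly. This is a minimal sketch under stated assumptions: the helper names `rect_contour_integrand` and `trapezoid` are hypothetical, and the truncation to $[-50, 50]$ is chosen only because the integrand is negligible beyond it.

```python
import math


def rect_contour_integrand(x: float) -> float:
    """x e^x / (1 + e^(2x))^2, arranged to avoid overflow for large |x|."""
    if x > 0:
        # Multiply numerator and denominator by e^(-4x).
        return x * math.exp(-3.0 * x) / (1.0 + math.exp(-2.0 * x)) ** 2
    return x * math.exp(x) / (1.0 + math.exp(2.0 * x)) ** 2


def trapezoid(f, lo: float, hi: float, n: int) -> float:
    """Composite trapezoidal rule with n equal subintervals."""
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi)) + sum(f(lo + j * h) for j in range(1, n))
    return s * h


I_num = trapezoid(rect_contour_integrand, -50.0, 50.0, 20_000)
print(I_num, -math.pi / 4)
```

The numerical value is real and negative, as expected from the sign of the integrand for $x < 0$, where most of the weight lies.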


9.3.2 Fresnel Integrals 


We would like to derive the equations

\[ \int_0^\infty \cos(kx^2)\,dx = \int_0^\infty \sin(kx^2)\,dx = \frac{1}{2}\sqrt{\frac{\pi}{2k}} \]



with a real positive constant k. These are known as the Fresnel cosine 
integral and Fresnel sine integral. Integrals of this type are encountered 
in the study of a phenomenon called diffraction, which is exhibited by all types 
of waves such as light and sound. 

In this connection we consider the integral 


\[ I = \oint e^{ikz^2}\,dz \quad (k > 0) \tag{9.41} \]

around the contour shown in Fig. 9.4. The integration variable $z$ becomes $z = x$
on the segment along the real axis, $z = Re^{i\phi}$ $(0 \le \phi \le \pi/4)$ along the large
(ultimately infinite) arc, and $z = x(1+i)$ along the slanted segment defined
by $y = x$. Therefore, with $(1+i)^2 = 2i$, we have

\[ \lim_{R\to\infty}\oint e^{ikz^2}\,dz = \int_0^\infty e^{ikx^2}\,dx + \lim_{R\to\infty}\int_0^{\pi/4} e^{ikR^2e^{2i\phi}}\,iRe^{i\phi}\,d\phi + (1+i)\int_\infty^0 e^{-2kx^2}\,dx. \tag{9.42} \]

Our objective is to evaluate the real and imaginary parts of the first integral 
on the right-hand side of (9.42). Then, evaluations of the other integrals shown 
in (9.42) complete the computation. 

First, we readily obtain 


\[ \oint e^{ikz^2}\,dz = 0, \tag{9.43} \]

since there are no poles within the contour of Fig. 9.4.

Fig. 9.4. Contour for evaluating the integral (9.41)

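Second, we consider the integral along the arc, which is given in the second
term on the right-hand side of (9.42). On the large arc, we have

\[ \left|iRe^{i\phi}e^{ikR^2e^{2i\phi}}\right| = \left|Re^{ikR^2\cos(2\phi) - kR^2\sin(2\phi)}\right| = Re^{-kR^2\sin(2\phi)}, \]

where $\sin(2\phi)$ is always nonnegative in the range $0 \le \phi \le \pi/4$.
Hence, for every fixed $\phi$ with $0 < \phi \le \pi/4$,

\[ \lim_{R\to\infty} Re^{-kR^2\sin(2\phi)} = 0, \tag{9.44} \]

so that the integral along the arc vanishes in the limit $R \to \infty$. More precisely,
since $\sin(2\phi) \ge 4\phi/\pi$ on this range, the magnitude of the arc integral is
bounded by $\int_0^{\pi/4} Re^{-4kR^2\phi/\pi}\,d\phi \le \pi/(4kR)$, which tends to zero. In fact,
l'Hôpital's rule states that for $a > 0$,

\[ \lim_{R\to\infty}\frac{R}{e^{aR^2}} = \lim_{R\to\infty}\frac{1}{2aRe^{aR^2}} = 0. \]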


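Finally we examine the integral along the slanted segment, i.e., the third
term on the right-hand side of (9.42). To evaluate it, we consider the quantity

\[ J = \left(\int_{-\infty}^{\infty} e^{-2kx^2}\,dx\right)^2 = \int_{-\infty}^{\infty} e^{-2kx^2}\,dx\int_{-\infty}^{\infty} e^{-2ky^2}\,dy = \int_{-\infty}^{\infty}dx\int_{-\infty}^{\infty}dy\;e^{-2k(x^2+y^2)}. \]

In terms of the polar coordinates, it yields

\[ J = \int_0^\infty r\,dr\int_0^{2\pi}d\theta\;e^{-2kr^2} = \frac{1}{2}\int_0^\infty d(r^2)\int_0^{2\pi}d\theta\;e^{-2kr^2} = \frac{\pi}{2k}, \]

and we have the Gaussian integral given by

\[ \int_{-\infty}^{\infty} e^{-2kx^2}\,dx = \sqrt{\frac{\pi}{2k}}, \]

so that

\[ (1+i)\int_\infty^0 e^{-2kx^2}\,dx = -(1+i)\sqrt{\frac{\pi}{8k}}. \tag{9.45} \]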


Substituting the results of (9.43), (9.44), and (9.45) into (9.42), we find that

\[ \int_0^\infty e^{ikx^2}\,dx = \frac{1+i}{2}\sqrt{\frac{\pi}{2k}}. \tag{9.46} \]

Writing the exponential in trigonometric form and equating the real and
imaginary parts of both sides of (9.46), we obtain the Fresnel integrals:

\[ \int_0^\infty \cos(kx^2)\,dx = \int_0^\infty \sin(kx^2)\,dx = \frac{1}{2}\sqrt{\frac{\pi}{2k}}. \]
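
The Fresnel integrals converge only conditionally, which a direct numerical experiment makes visible. The sketch below is illustrative only: `fresnel_partial` is a hypothetical helper that truncates the integrals at a finite upper limit `x_max`, so agreement with $\frac{1}{2}\sqrt{\pi/2k}$ is expected only up to an oscillating remainder of order $1/(2k\,x_{\max})$.

```python
import math


def fresnel_partial(k: float, x_max: float, n: int):
    """Midpoint-rule estimates of int_0^x_max cos(k x^2) dx and sin(k x^2) dx."""
    h = x_max / n
    c = s = 0.0
    for j in range(n):
        x = (j + 0.5) * h
        c += math.cos(k * x * x)
        s += math.sin(k * x * x)
    return c * h, s * h


k = 1.0
c_num, s_num = fresnel_partial(k, x_max=50.0, n=500_000)
target = 0.5 * math.sqrt(math.pi / (2.0 * k))
# The truncated integrals oscillate around the target with amplitude ~ 1/(2 k x_max).
print(c_num, s_num, target)
```

Increasing `x_max` (with `n` scaled so the step stays well below the local oscillation length) shrinks the remainder slowly, in contrast to the rapidly convergent Gaussian integral obtained on the rotated segment.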


9.3.3 Summation of Series 

Our final application of the residue theorem is the summation of a series
$\sum_{n=-\infty}^{\infty} f(n)$. Using this method, we can convert a certain type of series to
simple forms such as

\[ \sum_{n=-\infty}^{\infty} \frac{1}{(a+n)^2} = \frac{\pi^2}{\sin^2(\pi a)} \tag{9.47} \]

and

\[ \sum_{n=1}^{\infty} \frac{2x}{x^2 + n^2\pi^2} = \coth x - \frac{1}{x}. \]


This technique is particularly useful, for instance, to express a power series 
solution of a differential equation in a simple closed form. In fact, this device 
is generalized for various series summations as shown below. 


Theorem:

An infinite series of functions $f(n)$ with respect to an integer $n$ is given
by

\[ \sum_{n=-\infty}^{\infty} f(n) = -\sum_{n=-\infty}^{\infty} \mathrm{Res}(g, a_n), \tag{9.48} \]

where $\mathrm{Res}(g, a_n)$ is the residue of the specific function

\[ g(z) = \frac{\pi f(z)}{\tan(\pi z)} \]

at the $n$th pole of $f(z)$ located at $z = a_n$.

According to this theorem, we see that if the number of poles of $f(z)$ is finite
and the values of $\mathrm{Res}(g, a_n)$ are readily obtained, the series on the left-hand
side of (9.48) is written in a simple form.


Proof The key point is to use the function $\pi/\tan(\pi z)$. This function
has simple poles at $z = 0, \pm 1, \pm 2, \cdots$, each with residue 1 evaluated as

\[ \lim_{z\to n}\frac{\pi}{\tan(\pi z)}\cdot(z - n) = \lim_{z\to n}\frac{\pi}{\pi/\cos^2(\pi z)} = 1, \]

where we used l'Hôpital's rule (see Exercise 3 in Sect. 8.1). In addition, the
function $\pi/\tan(\pi z)$ is bounded at infinity except on the real axis. To derive
(9.48), let us consider the contour integral

\[ \oint \frac{\pi f(z)}{\tan(\pi z)}\,dz \tag{9.49} \]

around the contour $C_1$ shown in Fig. 9.5. Here $f(z)$ is assumed to have no
branch points or essential singularities anywhere. Since only the pole at $z = 0$
is found within $C_1$, the contour integral equals $2\pi i$ times the residue of the
integrand at $z = 0$, which is $f(0)$, i.e.,

\[ \oint_{C_1} \frac{\pi f(z)}{\tan(\pi z)}\,dz = 2\pi i f(0). \]

Fig. 9.5. A sequence of rectangular contours to derive equation (9.48)

Next, the integral around contour $C_2$ is

\[ \oint_{C_2} \frac{\pi f(z)}{\tan(\pi z)}\,dz = 2\pi i\left[f(0) + f(1) + f(-1) + \mathrm{Res}(g, a_1)\right], \]

where $\mathrm{Res}(g, a_1)$ stems from the contribution of the pole of $f(z)$ located at
$z = a_1$. Finally, for a contour at infinity, the integral must be

\[ \oint_{C_\infty} \frac{\pi f(z)}{\tan(\pi z)}\,dz = 2\pi i\sum_{n=-\infty}^{\infty}\left[f(n) + \mathrm{Res}(g, a_n)\right]. \tag{9.50} \]

If $|zf(z)| \to 0$ as $|z| \to \infty$, the infinite contour integral is zero so that we
successfully obtain the equation:

\[ \sum_{n=-\infty}^{\infty} f(n) = -\sum_{n=-\infty}^{\infty} \mathrm{Res}(g, a_n). \tag{9.51} \]


9.3.4 Langevin and Riemann zeta Functions 

Our present aim is to establish the equivalence between Langevin's
function,

\[ \coth x - (1/x), \]

and the sum

\[ \sum_{n=1}^{\infty} \frac{2x}{x^2 + n^2\pi^2}. \]

Letting $f(z) = 2x/(x^2 + z^2\pi^2)$, and using the above equation, we obtain

\[ \sum_{n=-N}^{N} \frac{2x}{x^2 + n^2\pi^2} = \frac{1}{2\pi i}\oint_C \pi\cot(\pi z)f(z)\,dz - \sum_{\text{poles of } f}\mathrm{Res}\left[\pi\cot(\pi z)f(z)\right], \]

where $C$ is a closed contour, say, a rectangle, enclosing the points $z = 0, \pm 1, \cdots$.
Now let the length and width of the rectangle $C$ approach $\infty$.
As this happens,

\[ \left|\frac{1}{2\pi i}\oint_C \pi\cot(\pi z)f(z)\,dz\right| \le \frac{1}{2\pi}\oint_C \pi\,|\cot\pi z|\left|\frac{2x}{x^2 + z^2\pi^2}\right||dz| \to 0. \tag{9.52} \]

Hence, we have

\[ \sum_{n=-\infty}^{\infty} \frac{2x}{x^2 + n^2\pi^2} = -\mathrm{Res}\left[\frac{(\pi\cot\pi z)\,2x}{x^2 + z^2\pi^2}\right]_{z=\pm ix/\pi} = -\frac{2x}{\pi}\left[\frac{\cot(ix)}{2ix/\pi} + \frac{\cot(-ix)}{-2ix/\pi}\right] = 2i\cot(ix) = 2\coth x. \]

This result can be rewritten as

\[ \frac{2}{x} + 2\sum_{n=1}^{\infty} \frac{2x}{x^2 + n^2\pi^2} = 2\coth x \]

or

\[ \coth x - \frac{1}{x} = \sum_{n=1}^{\infty} \frac{2x}{x^2 + n^2\pi^2}, \tag{9.53} \]


which establishes the result we stated at the outset. 
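
The identity (9.53) is easy to probe numerically. The following is a minimal sketch, with hypothetical helper names `langevin_closed_form` and `langevin_series`; the series is simply truncated after `n_terms` terms, so the agreement is limited by a tail of order $2x/(\pi^2 n_{\text{terms}})$.

```python
import math


def langevin_closed_form(x: float) -> float:
    """Langevin's function coth(x) - 1/x for x != 0."""
    return 1.0 / math.tanh(x) - 1.0 / x


def langevin_series(x: float, n_terms: int = 100_000) -> float:
    """Partial sum of sum_{n>=1} 2x / (x^2 + n^2 pi^2)."""
    return sum(2.0 * x / (x * x + (n * math.pi) ** 2) for n in range(1, n_terms + 1))


for x in (0.5, 2.0, 5.0):
    print(x, langevin_closed_form(x), langevin_series(x))
```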
Remark. To see that the integral in (9.52) vanishes as $|z| \to \infty$, we observe that,
writing $z = x + iy$ (so that $x$ and $y$ here denote the real and imaginary parts of $z$),

\[ |\cot\pi z| = \frac{|\cos\pi z|}{|\sin\pi z|} = \sqrt{\frac{\cos^2\pi x + \sinh^2\pi y}{\sin^2\pi x + \sinh^2\pi y}}. \]

If we choose the rectangle whose vertical sides cross the $x$-axis at a large
enough half-integer, say, $x = 10^5 + \frac{1}{2}$, so that $\cos\pi x = 0$ and $\sin\pi x = 1$, then
over these sides of the rectangle

\[ |\cot\pi z| = \sqrt{\frac{\sinh^2\pi y}{1 + \sinh^2\pi y}} = |\tanh\pi y| \le 1. \]

Over the horizontal sides of the rectangle, $\lim_{z\to\infty}|\cot\pi z| = 1$. Thus the
integrand goes as $|1/z^2|$ as $|z| \to \infty$, and the integral vanishes.


If we integrate both sides of (9.53) from 0 to $x$, we get

\[ \sum_{n=1}^{\infty} \ln\left(1 + \frac{x^2}{n^2\pi^2}\right) = \ln\left(\frac{\sinh x}{x}\right). \]

Hence,

\[ \frac{\sinh x}{x} = \prod_{n=1}^{\infty}\left(1 + \frac{x^2}{n^2\pi^2}\right). \]

We may extend this result to all $z$ in the complex plane by analytic
continuation. Then setting $x = i\theta$ with $\theta$ real, we obtain

\[ \sin\theta = \theta\prod_{n=1}^{\infty}\left(1 - \frac{\theta^2}{n^2\pi^2}\right). \]

This infinite product formula displays all the zeros of $\sin\theta$ explicitly. It
represents the complete factorization of the Taylor series and can, in fact, be taken
as the definition of the sine function.

By equating coefficients of the $\theta^3$ term of both sides of the above equation,
we obtain a useful sum:

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}, \]

which is a special value of the Riemann zeta function,

\[ \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z}. \]
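
Both the product formula for the sine and the value $\pi^2/6$ can be illustrated with truncated products and sums. This is a rough sketch only: `sine_product` and `basel_partial` are hypothetical names, and the truncation orders are arbitrary choices that leave errors of order $1/N$.

```python
import math


def sine_product(theta: float, n_factors: int = 100_000) -> float:
    """Truncated product theta * prod_{n<=N} (1 - theta^2 / (n pi)^2)."""
    value = theta
    for n in range(1, n_factors + 1):
        value *= 1.0 - (theta / (n * math.pi)) ** 2
    return value


def basel_partial(n_terms: int = 100_000) -> float:
    """Partial sum of sum_{n>=1} 1/n^2."""
    return sum(1.0 / (n * n) for n in range(1, n_terms + 1))


print(sine_product(1.0), math.sin(1.0))
print(basel_partial(), math.pi ** 2 / 6.0)
```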
Exercises

1. Evaluate $\sum_{n=-\infty}^{\infty} (a+n)^{-2}$ by considering the contour integral

\[ I = \oint_C \frac{\pi}{\tan(\pi z)}\frac{1}{(a+z)^2}\,dz, \]

where $a$ is not an integer and $C$ is a circle of large radius.

Solution: In order to use equation (9.48), we define

\[ f(z) = \frac{1}{(a+z)^2} \quad\text{and}\quad g(z) = \frac{\pi}{\tan(\pi z)}\frac{1}{(a+z)^2}. \]

Since the integrand $g(z)$ has simple poles at $z = 0, \pm 1, \pm 2, \cdots$ and
a double pole at $z = -a$, evaluation of $\mathrm{Res}(g, -a)$ completes the
problem [see (9.48)]. To find the residue at $z = -a$, set $z = -a + \xi$
for small $\xi$ and determine the coefficient of $\xi^{-1}$:

\[ \frac{\pi}{\tan(\pi z)}\frac{1}{(a+z)^2} = \frac{\pi}{\xi^2}\frac{1}{\tan(-a\pi + \xi\pi)} = \frac{\pi}{\xi^2}\left[\frac{1}{\tan(-a\pi)} + \xi\,\frac{d}{dz}\frac{1}{\tan(\pi z)}\bigg|_{z=-a} + \cdots\right]. \]

It follows from this expansion that the residue at the double pole $z = -a$
is

\[ \pi\,\frac{d}{dz}\frac{1}{\tan(\pi z)}\bigg|_{z=-a} = -\frac{\pi^2}{\sin^2(\pi z)}\bigg|_{z=-a} = -\frac{\pi^2}{\sin^2(\pi a)}. \]

Therefore, it is readily seen from (9.48) that

\[ \sum_{n=-\infty}^{\infty} \frac{1}{(a+n)^2} = \frac{\pi^2}{\sin^2(\pi a)}. \]
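
The result of Exercise 1 can be cross-checked by brute-force summation. A minimal sketch, assuming a symmetric truncation at $|n| \le n_{\max}$; the helper name `lattice_sum` and the sample value of `a` are illustrative choices, not taken from the text.

```python
import math


def lattice_sum(a: float, n_max: int) -> float:
    """Symmetric partial sum of 1/(a+n)^2 over n = -n_max, ..., n_max."""
    return sum(1.0 / (a + n) ** 2 for n in range(-n_max, n_max + 1))


a = 0.3  # any non-integer value works
closed_form = math.pi ** 2 / math.sin(math.pi * a) ** 2
partial = lattice_sum(a, 100_000)
# The neglected tail is of order 2 / n_max.
print(partial, closed_form)
```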


9.4 Argument Principle 

9.4.1 The Principle 

It may occur that a function $f(z)$ has several zeros and poles simultaneously
in a domain $D$. If we denote the number of such zeros and poles by $N_0$ and
$N_\infty$, respectively, these numbers are related to one another as stated below.

Argument principle:

Let $f(z)$ be an analytic function within a closed contour $C$ except at a
finite number of poles. If $f(z) \ne 0$ on $C$, then

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = N_0 - N_\infty, \tag{9.54} \]

where $N_0$ and $N_\infty$ are the numbers of zeros and poles of $f(z)$ in $C$,
respectively. Both zeros and poles are to be counted with their multiplicities.


Proof By the residue theorem, the integral

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz \]

is equal to the sum of the residues of the logarithmic derivative of $f(z)$ in $D$,

\[ g(z) = \frac{f'(z)}{f(z)} = \frac{d[\log f(z)]}{dz}. \]

The only possible singularities of $g(z)$ in $D$ coincide with the zeros and poles
of $f(z)$. In order to determine the residue of $g(z)$ at a zero of $f(z)$, we observe
that in the neighborhood of a zero $a$ of the $n$th order, $f(z)$ has an expansion

\[ f(z) = (z-a)^n\left[c_1 + c_2(z-a) + \cdots\right], \quad c_1 \ne 0. \]

We therefore have

\[ f(z) = (z-a)^n f_1(z), \]

where $f_1(z) \ne 0$ in a certain neighborhood of $z = a$. Hence,

\[ \log f(z) = n\log(z-a) + \log f_1(z), \]

and

\[ \frac{f'(z)}{f(z)} = \frac{n}{z-a} + \frac{f_1'(z)}{f_1(z)}, \]

where the last term is analytic at $z = a$. It follows that the residue of $g(z)$,
which is called the logarithmic residue of $f(z)$ at $z = a$, is $n$, i.e., it is equal
to the order of the zero of $f(z)$ at $z = a$. If the zeros of $f(z)$ in $D$ are counted
with their multiplicities, the sum of the logarithmic residues of $f(z)$ at the
zeros of $f(z)$ in $D$ will be equal to the number of zeros.

We now turn to the poles of $f(z)$ in $D$. If $z = b$ is a pole of order $m$, we
have near it an expansion

\[ f(z) = \frac{c_1}{(z-b)^m} + \cdots + \frac{c_m}{z-b} + c_{m+1} + \cdots = \frac{1}{(z-b)^m}\left[c_1 + c_2(z-b) + \cdots\right] = \frac{f_2(z)}{(z-b)^m}, \]

where $f_2(z)$ is analytic at $z = b$ and $f_2(b) \ne 0$. Hence,

\[ \frac{f'(z)}{f(z)} = -\frac{m}{z-b} + \frac{f_2'(z)}{f_2(z)}, \]

which shows that the logarithmic residue of $f(z)$ at a pole of $f(z)$ of order $m$
is $-m$. If the poles of $f(z)$ in $D$ are counted with their multiplicities, the sum
of the logarithmic residues of $f(z)$ at the poles of $f(z)$ in $D$ will be equal to
minus the number of these poles. Since $g(z)$ has no singularities in $D$ except
at the zeros and poles of $f(z)$, we have proven our theorem.


Remark. If we replace $f(z)$ in (9.54) by $f(z) - a$, this formula will yield the
difference between the number of zeros and the number of poles of $f(z) - a$. Since the
latter are identical with the poles of $f(z)$, we find that

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z) - a}\,dz = N_a - N_\infty, \]

where $N_a$ indicates how often the value $a$ is taken by $f(z)$ in $D$.


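Examples 1. For $f(z) = z^2$ and $C: |z| = 1$, $N_0 = 2$ and $N_\infty = 0$ so that we
have

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = 2. \]

In fact, the integral reads

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i}\oint_C \frac{2z}{z^2}\,dz = \frac{1}{2\pi i}\times 2\times 2\pi i = 2. \]

2. For $f(z) = z/(z-a)$ and $C: |z| = R$, $N_0 = 1$ and

\[ N_\infty = \begin{cases} 1 & \text{if } R > |a|, \\ 0 & \text{if } R < |a|. \end{cases} \]

Hence, we have

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = \begin{cases} 0 & \text{if } R > |a|, \\ 1 & \text{if } R < |a|. \end{cases} \tag{9.55} \]

Indeed, $f(z) = 1 + [a/(z-a)]$, $f'(z) = -a/(z-a)^2$, $f'/f = (1/z) - [1/(z-a)]$,
which yields (9.55).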
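
Example 2 can be reproduced numerically by discretizing the contour integral in (9.54). The code below is a sketch under stated assumptions: `log_derivative_integral` is a hypothetical helper, the contour is restricted to a circle, and the value $a = 2$ is an arbitrary illustrative choice.

```python
import cmath
import math


def log_derivative_integral(f, df, center: complex, radius: float, n: int = 2000) -> complex:
    """(1 / (2 pi i)) * integral of f'(z)/f(z) over the circle |z - center| = radius.

    Uses the trapezoidal rule in the angle, which converges very quickly
    for integrands that are analytic near the contour.
    """
    total = 0j
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        e = cmath.exp(1j * theta)
        z = center + radius * e
        dz = 1j * radius * e * (2.0 * math.pi / n)
        total += df(z) / f(z) * dz
    return total / (2j * math.pi)


a = 2.0
f = lambda z: z / (z - a)
df = lambda z: -a / (z - a) ** 2

count_large = log_derivative_integral(f, df, 0.0, 3.0)  # R > |a|: N0 - Ninf = 1 - 1
count_small = log_derivative_integral(f, df, 0.0, 1.0)  # R < |a|: N0 - Ninf = 1 - 0
print(count_large, count_small)
```

The two results should be close to 0 and 1, matching the two cases of (9.55).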

9.4.2 Variation of the Argument

Equation (9.54) can be brought into a different form in which its geometric
character becomes more apparent. If we write

\[ \varphi = \arg f(z), \quad f(z) = |f(z)|e^{i\varphi}, \]

we obtain

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i}\oint_C d[\log f(z)] = \frac{1}{2\pi i}\oint_C \left[d\log|f(z)| + i\,d\varphi\right] = \frac{1}{2\pi i}\oint_C d\log|f(z)| + \frac{1}{2\pi}\oint_C d\varphi. \]

Recall that $\log w$ is a many-valued function of $w$. If $\log w$ is continued
along a closed curve that surrounds the origin, we shall not return to the value
of $\log w$ with which we started. However, this many-valuedness is confined
to $\mathrm{Im}(\log w) = \arg w$, i.e., $\mathrm{Re}(\log w) = \log|w|$ is single-valued. If we write
$w = f(z)$, it follows that

\[ \oint_C d\log|f(z)| = 0. \]

In fact,

\[ \int_{z_1}^{z_2} d\log|f(z)| = \log|f(z_2)| - \log|f(z_1)|, \]

and if the integration is performed over a closed contour, the terminals $z_1$
and $z_2$ of the integration coincide; moreover, owing to the single-valuedness of
$\log|f(z)|$, the value of the integral is zero. Hence, we have

\[ \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi}\oint_C d\varphi, \tag{9.56} \]

where $\varphi = \arg f(z)$.

To interpret (9.56), we observe that

\[ \int_{z_1}^{z_2} d\varphi = \varphi(z_2) - \varphi(z_1) = \arg f(z_2) - \arg f(z_1) \]

is the quantitative change in the argument of $f(z)$, which is called the
variation of the argument of $f(z)$. The integral $\oint_C d\varphi$ is therefore the
total variation of $\arg f(z)$ if $z$ describes the entire boundary $C$ of the domain
$D$. It is clear that the value of this integral must be an integral multiple of
$2\pi$. If $z$ describes $C$, the point $f(z)$ describes a closed curve $C'$, and if $C'$
surrounds the origin $m$ times in the positive (counterclockwise) direction, the
increase in $\arg f(z)$ along $C'$ will be $2\pi m$. In view of (9.54) and (9.56), we
obtain the theorem below.

Theorem:

Let the domain $D$ be bounded by one or more closed contours $C$ and let
a function $f(z)$ be single-valued and analytic apart from a finite number
of poles. If $N_0$ and $N_\infty$ denote the number of zeros and poles of $f(z)$ in $D$,
respectively, and $f(z) \ne 0$ on $C$, then

\[ \frac{1}{2\pi}\Delta_C = N_0 - N_\infty, \]

where $\Delta_C$ denotes the total variation of $\arg f(z)$.
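
The total variation $\Delta_C$ can be measured directly by following $\arg f(z)$ along the contour in small steps. The sketch below is illustrative only: `argument_variation` is a hypothetical helper restricted to circular contours, and the test function is an arbitrary rational function chosen to have three zeros and one pole inside the unit circle.

```python
import cmath
import math


def argument_variation(f, center: complex, radius: float, n: int = 4000) -> float:
    """Total change of arg f(z) as z runs once counterclockwise around a circle.

    Each step adds the phase of f(z_new)/f(z_old), which stays in (-pi, pi]
    as long as the steps are small enough for the phase to change slowly.
    """
    total = 0.0
    prev = f(center + radius)
    for j in range(1, n + 1):
        z = center + radius * cmath.exp(2j * math.pi * j / n)
        cur = f(z)
        total += cmath.phase(cur / prev)
        prev = cur
    return total


# Zeros at 0 (double) and 0.5, pole at 0.25, all inside |z| = 1.
f = lambda z: z ** 2 * (z - 0.5) / (z - 0.25)
winding = argument_variation(f, 0.0, 1.0) / (2.0 * math.pi)
print(winding)  # N0 - Ninf = 3 - 1
```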


9.4.3 Extension of the Argument Principle

The argument principle can be extended to the case in which $f(z)$ has zeros
or poles on the boundary $C$ of the domain $D$. Suppose that $f(z_0) = 0$, where
$z_0$ is situated on $C$. Let $f(z)$ be analytic at $z_0$; then we have

\[ f(z) = (z - z_0)^m f_1(z), \quad f_1(z_0) \ne 0, \]

if $m$ is the multiplicity of the zero. In view of the relation

\[ \log f(z) = m\log(z - z_0) + \log f_1(z), \]

it follows that

\[ \arg f(z) = m\arg(z - z_0) + \arg f_1(z). \]

At $z = z_0$, $f_1(z) \ne 0$ and $\log f_1(z)$ is analytic. Hence, $\arg f_1(z)$ will vary
continuously if $z$ varies along $C$ and passes through $z = z_0$, but the expression
$\arg(z - z_0)$ shows a different behavior. Since this is the angle between the
parallel to the positive real axis through $z_0$ and the linear segment drawn from $z_0$
to $z$, it is clear that if $z_0$ is passed, $\arg(z - z_0)$ jumps by the amount $\pi$. The
contribution of this zero to $\arg f(z)$ will be $m\pi$, i.e., one-half of what it would
have been if the zero were situated in the interior of $D$. If $z = z_0$ is a pole of
order $m$, its contribution to $\arg f(z)$ will be $-m\pi$. This follows immediately
from the fact that $f(z)^{-1}$ has a zero of order $m$ at $z_0$ and that

\[ \log\left[f(z)^{-1}\right] = -\log f(z). \]
We therefore have the following extension of the argument principle.

Extended argument principle:

The argument principle remains valid if f(z) has poles and zeros on the 
boundary, provided that these poles and zeros are counted with half their 
multiplicities. 


9.4.4 Rouché Theorem

As an application of the argument principle, we prove the following result,
known as the Rouché theorem.


Rouché theorem:

If the functions $f(z)$ and $g(z)$ are analytic and single-valued in a domain
$D$ and on its boundary $C$ and if $|g(z)| < |f(z)|$ on $C$, then the number of
zeros of the function $f(z) + g(z)$ within $D$ is equal to that of zeros of $f(z)$.


Proof We have

\[ \log[f(z) + g(z)] = \log f(z) + \log\left[1 + \frac{g(z)}{f(z)}\right], \]

whence

\[ \arg[f(z) + g(z)] = \arg f(z) + \arg\left[1 + \frac{g(z)}{f(z)}\right]. \tag{9.57} \]

On the contour $C$, we have

\[ \left|\frac{g(z)}{f(z)}\right| < 1. \]

It thus follows that the points

\[ w = 1 + \frac{g(z)}{f(z)}, \quad z \in C \tag{9.58} \]

are all situated in the interior of the circle $|1 - w| = 1$. Since this circle does not
contain the origin, the curve (9.58) cannot surround that point. As a result,
the total variation of the argument of (9.58) along $C$ is zero. Hence, by (9.57),
we have

\[ \Delta_C[f(z) + g(z)] = \Delta_C[f(z)]. \]

Since neither $f(z)$ nor $f(z) + g(z)$ has poles in $D$, it follows from (9.54) that
these two functions have the same number of zeros in $D$.
The application of Rouché's theorem is illustrated by the following short
proof of the maximum principle. If $f(z)$ is analytic in $D + C$ and there is a
point $z_0$ in $D$ such that

\[ |f(z)| < |f(z_0)| \quad\text{for } z \in C, \]

then it follows from Rouché's theorem that the functions $f(z_0) - f(z)$ and $f(z_0)$
have the same number of zeros in $D$. The constant $f(z_0)$ has no zeros, since
$|f(z_0)| > 0$, but the function $f(z) - f(z_0)$ has at least
one zero there, namely, at $z = z_0$. The assumption that $|f(z)| < |f(z_0)|$ for
$z \in C$ thus leads to a contradiction.


Exercises 


1. Let $z_j$ be the zeros of a function $f(z)$ that is analytic in a circular domain
$D$ and let $f(z) \not\equiv 0$. Each zero is counted as many times as its multiplicity.
Prove that for every closed curve $C$ in $D$ that does not pass through a
zero, the sum of winding numbers yields

\[ \sum_j n(C, z_j) = \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z)}\,dz. \tag{9.59} \]

Solution: From the hypothesis, we can write $f(z) = (z - z_1)(z - z_2)\cdots(z - z_n)g(z)$,
where $g(z)$ is analytic and $g(z) \ne 0$ in $D$.
Forming the logarithmic derivative, we obtain

\[ \frac{f'(z)}{f(z)} = \frac{1}{z - z_1} + \frac{1}{z - z_2} + \cdots + \frac{1}{z - z_n} + \frac{g'(z)}{g(z)} \]

for $z \ne z_j$, and particularly on $C$. Since $g(z) \ne 0$ in $D$, Cauchy's
theorem yields $\oint_C g'(z)/g(z)\,dz = 0$. Recalling the definition of
$n(C, z_j)$, we get the desired result (9.59).

2. Show that an analytic function in a domain $D$ that takes only real values
on the boundary $C$ of $D$ reduces to a constant.

Solution: Let $\xi = a + ib$, $b \ne 0$, be a nonreal complex number
and consider the values of $f(z) - \xi$ for $z \in C$. If $b > 0$, say, we
have $\mathrm{Im}[f(z) - \xi] = -b < 0$ since $f(z)$ is real on $C$. The values of
$f(z) - \xi$ are thus confined to the lower half-plane so that the
curve described by $f(z) - \xi$ cannot surround the origin. Hence, we
have $\Delta_C[f(z) - \xi] = 0$. Furthermore, since $f(z) - \xi$ is analytic in
$D + C$, it follows from the argument principle that $f(z) - \xi \ne 0$,
i.e., $f(z) \ne \xi$ in $D$. The same reasoning also applies to values $\xi$ for
which $b < 0$. We thus conclude that $f(z)$ does not take nonreal
values in $D$.

Next we show that the above result means that $f(z)$ reduces to a
constant. Since $f(z)$ is analytic in $D$, we have

\[ f'(z) = \lim_{h\to 0}\frac{f(z+h) - f(z)}{h} = \lim_{h\to 0}\frac{f(z+ih) - f(z)}{ih}, \]

where $h \to 0$ through positive values. Since $f(z)$ is real throughout
$D$, the first limit is real and the second limit is imaginary. They can
therefore be equal only if they are both zero. Since $z$ is arbitrary,
it follows that $f'(z) = 0$ throughout $D$; hence, $f(z) = \text{const}$.

3. Show that all zeros of the polynomial

\[ p(z) = z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0 \]

are located within the region $|z| \le R_0$, where

\[ R_0 = \max\left\{1 + |a_{n-1}|, 1 + |a_{n-2}|, \cdots, 1 + |a_1|, |a_0|\right\}. \]

Solution: Let $f(z) = z^n$, $g(z) = a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$, and
let $R_k = R_0 + (1/k)$ for an arbitrary fixed $k \in \mathbb{N}$. Observe that

\[ |a_j| \le R_0 - 1 < R_k - 1 \quad\text{for } j = 1, 2, \cdots, n-1 \]

and $|a_0| \le R_0 < R_k$. Then, if $|z| = R_k$, we have

\[ |g(z)| \le |a_{n-1}||z|^{n-1} + \cdots + |a_1||z| + |a_0| < (R_k - 1)R_k^{n-1} + \cdots + (R_k - 1)R_k + R_k = R_k^n = |f(z)|. \]

In view of Rouché's theorem, $f(z)$ and $f(z) + g(z) = p(z)$ have
the same number of zeros within the region $|z| < R_k$. Since $f(z)$
has $n$ zeros there and $p(z)$ is an $n$th-order polynomial, we conclude that
all the zeros of $p(z)$ have to be located within the region $|z| < R_k$.
Finally, we take the limit $k \to \infty$ (since $k$ is arbitrary) to find that
all the zeros of $p(z)$ have to be located within $|z| \le R_0$.
4. Show that the equation $z^3 + 3z + 1 = 0$ has solutions whose absolute values
are less than 2.

Solution: Let $z$ be on the circle $|z| = 2$. Then we have

\[ |z^3| = 8 > 3\cdot 2 + 1 = 3|z| + 1 \ge |3z + 1|. \]

By Rouché's theorem with $f(z) = z^3$ and $g(z) = 3z + 1$, this means that
there are three solutions to the equation $z^3 + 3z + 1 = 0$ and that all of
them satisfy $|z| < 2$.
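
The conclusion of Exercise 4 can be confirmed with a few lines of arithmetic. This is a minimal sketch, not the method of the text: it samples the Rouché inequality on the circle $|z| = 2$, finds the real root by Newton's iteration, and uses the factorization $z^3 + 3z + 1 = (z - r)(z^2 + rz + r^2 + 3)$, which holds exactly when $r$ is a root. The names `real_root_newton`, `margin`, and `pair_modulus` are hypothetical.

```python
import cmath
import math


def real_root_newton(x: float = 0.0, iters: int = 50) -> float:
    """Newton iteration for the real root of p(x) = x^3 + 3x + 1."""
    for _ in range(iters):
        x -= (x ** 3 + 3.0 * x + 1.0) / (3.0 * x ** 2 + 3.0)
    return x


# Smallest value of |z^3| - |3z + 1| over sample points of the circle |z| = 2.
margin = min(
    abs(z ** 3) - abs(3.0 * z + 1.0)
    for z in (2.0 * cmath.exp(2j * math.pi * j / 3600) for j in range(3600))
)

r = real_root_newton()
# The quadratic factor z^2 + r z + (r^2 + 3) has negative discriminant, so its
# roots form a complex-conjugate pair with common modulus sqrt(r^2 + 3).
pair_modulus = math.sqrt(r * r + 3.0)
print(margin, r, pair_modulus)
```

A positive `margin` reflects $|g| < |f|$ on the sampled contour, and both `abs(r)` and `pair_modulus` come out below 2.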
The previous sections treated contour integrals whose integrand has no pole
on the contour $C$. If a pole is located on $C$, the integrand diverges at the
pole so that we cannot use ordinary integration methods. This difficulty is
overcome by introducing a new concept called the principal value integral.
To derive it, we consider an integral

\[ I = \oint_C \frac{f(z)}{z - \alpha}\,dz \tag{9.60} \]

with the integration contour depicted in Fig. 9.6. In (9.60), $\alpha$ is assumed to
be real without loss of generality. In addition, we assume that $f(z)$ is analytic
for $\mathrm{Im}\,z \ge 0$, and behaves as $|z|^\beta|f(z)| \to A$ $(\beta > 0)$ as $|z| \to \infty$ there. In order
for the integral (9.60) to be defined, the contour $C$ must be traversed in such
a way as to avoid the pole at $z = \alpha$. Then, since both $f(z)$ and $1/(z - \alpha)$ are
analytic within and on $C$, (9.60) equals zero. Therefore, by breaking it up, we
obtain the following expression:

\[ \oint_C \frac{f(z)}{z - \alpha}\,dz = \int_{-R}^{\alpha - r} \frac{f(x)}{x - \alpha}\,dx + \int_\gamma \frac{f(z)}{z - \alpha}\,dz + \int_{\alpha + r}^{R} \frac{f(x)}{x - \alpha}\,dx + \int_\Gamma \frac{f(z)}{z - \alpha}\,dz = 0. \tag{9.61} \]

Here $r$ is the radius of the small semicircle $\gamma$ centered at $x = \alpha$ and $R$ is
the radius of the large semicircle $\Gamma$ centered at the origin. The radius $r$
can be chosen as small as we please and $R$ can be chosen as large as we
please.

Our current interest is to determine what the sum of the four integrals
appearing in (9.61) converges to in the limits $r \to 0$ and
$R \to \infty$. This is seen by evaluating the integrals along $\gamma$ and $\Gamma$ given in
(9.61). First, once we set $z = Re^{i\theta}$, the integral along the large semicircle $\Gamma$
yields

\[ \int_\Gamma \frac{f(z)}{z - \alpha}\,dz = i\int_0^\pi \frac{f(Re^{i\theta})}{Re^{i\theta} - \alpha}\,Re^{i\theta}\,d\theta; \]

hence,

\[ \left|\int_\Gamma \frac{f(z)}{z - \alpha}\,dz\right| \le \frac{R}{|R - \alpha|}\int_0^\pi \left|f(Re^{i\theta})\right| d\theta, \tag{9.62} \]

where we have used the inequality

\[ \left|Re^{i\theta} - \alpha\right| = \sqrt{R^2 + \alpha^2 - 2R\alpha\cos\theta} \ge \sqrt{R^2 + \alpha^2 - 2R\alpha} = |R - \alpha|. \]

In the limit $R \to \infty$, the right-hand side of (9.62) vanishes since $\beta > 0$.
Therefore, the integral over the semicircle $\Gamma$ can be made arbitrarily small by
choosing $R$ sufficiently large.

Fig. 9.6. Integration contour on which the pole of the integrand is located

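Next, we write the integral along $\gamma$ as

\[ \int_\gamma \frac{f(z)}{z - \alpha}\,dz = f(\alpha)\int_\gamma \frac{dz}{z - \alpha} + \int_\gamma \frac{f(z) - f(\alpha)}{z - \alpha}\,dz. \tag{9.63} \]

By setting $z - \alpha = re^{i\theta}$, the first integral on the right-hand side is evaluated
as

\[ f(\alpha)\int_\gamma \frac{dz}{z - \alpha} = if(\alpha)\int_\pi^0 d\theta = -i\pi f(\alpha). \]

In addition, the Taylor series expansion of $f(z)$ around $z = \alpha$ yields

\[ \int_\gamma \frac{f(z) - f(\alpha)}{z - \alpha}\,dz = f'(\alpha)\int_\pi^0 ire^{i\theta}\,d\theta + \frac{f''(\alpha)}{2!}\int_\pi^0 re^{i\theta}\cdot ire^{i\theta}\,d\theta + \cdots = O(r), \]

which means that the second integral in (9.63) vanishes in the limit $r \to 0$.
Equation (9.61) thus yields

\[ \lim_{R\to\infty}\lim_{r\to 0}\left[\int_{-R}^{\alpha - r} \frac{f(x)}{x - \alpha}\,dx + \int_{\alpha + r}^{R} \frac{f(x)}{x - \alpha}\,dx\right] - i\pi f(\alpha) = 0. \tag{9.64} \]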


Now we introduce a new notation as shown below. 


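Principal value integral:

The notation

\[ \mathcal{P}\int_{-R}^{R} \frac{f(x)}{x - \alpha}\,dx \equiv \lim_{r\to 0}\left[\int_{-R}^{\alpha - r} \frac{f(x)}{x - \alpha}\,dx + \int_{\alpha + r}^{R} \frac{f(x)}{x - \alpha}\,dx\right] \]

provides the principal value integral (or the Cauchy principal value)
of $f(x)/(x - \alpha)$ for real $\alpha$.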
With this notation, (9.64) reads

\[ \lim_{R\to\infty}\mathcal{P}\int_{-R}^{R} \frac{f(x)}{x - \alpha}\,dx = i\pi f(\alpha), \]

where $f(x)$ is a complex-valued function of a real variable $x$. For the sake of
brevity, we write this simply as

\[ \mathcal{P}\int_{-\infty}^{\infty} \frac{f(x)}{x - \alpha}\,dx = i\pi f(\alpha). \tag{9.65} \]

This result provides a way to evaluate the contour integrals involving
singularities on the integration path. When we decompose $f(x)$ in (9.65) as
$f(x) = f_R(x) + if_I(x)$ and equate the real and imaginary parts, we obtain an
important relation between $f_R$ and $f_I$:


Hilbert transform pair:

A pair of functions $f_R$ and $f_I$ that satisfies the relations

\[ f_R(\alpha) = \frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{f_I(x)}{x - \alpha}\,dx, \qquad f_I(\alpha) = -\frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{f_R(x)}{x - \alpha}\,dx \tag{9.66} \]

is called a Hilbert transform pair.


It readily follows from (9.66) that if $f_I(x) = 0$, then $f_R(x) = 0$.


9.5.2 Several Remarks 


The principal value integral is seen as a way of avoiding singularities on a
path of integration; we integrate up to the point just before the singularity in
question, skip over the singularity, and begin integrating again immediately
beyond the singularity. This prescription enables us to make sense out of
integrals such as

\[ \int_{-R}^{R} \frac{dx}{x}. \tag{9.67} \]


At first glance, this integral seems to be zero, since an odd function is integrated
over a symmetric domain. However, the singularity at the origin makes the
integral meaningless unless we insert the symbol $\mathcal{P}$ in front of it. Following the
prescription for principal value integrals, we can easily evaluate the principal
value of (9.67):

\[ \mathcal{P}\int_{-R}^{R} \frac{dx}{x} = \lim_{r\to 0}\left[\int_{-R}^{-r} \frac{dx}{x} + \int_r^R \frac{dx}{x}\right]. \]

In the first integral on the right-hand side, we set $x = -y$. Then

\[ \mathcal{P}\int_{-R}^{R} \frac{dx}{x} = \lim_{r\to 0}\left[\int_R^r \frac{dy}{y} + \int_r^R \frac{dx}{x}\right], \]

where the two integrals within the brackets obviously cancel out.
Consequently, we have

\[ \mathcal{P}\int_{-R}^{R} \frac{dx}{x} = 0. \tag{9.68} \]

We emphasize again that the integral (9.68) is completely different from the
meaningless quantity in (9.67).

As a further step, we evaluate the principal value integral defined by

\[ \mathcal{P}\int_{-R}^{R} \frac{f(x)}{x - \alpha}\,dx. \]

It follows from the result of (9.68) that

\[ \mathcal{P}\int_{-R}^{R} \frac{f(x)}{x - \alpha}\,dx = \mathcal{P}\int_{-R}^{R}\left[\frac{f(\alpha)}{x - \alpha} + \frac{f(x) - f(\alpha)}{x - \alpha}\right]dx = f(\alpha)\ln\frac{R - \alpha}{R + \alpha} + \mathcal{P}\int_{-R}^{R} \frac{f(x) - f(\alpha)}{x - \alpha}\,dx. \tag{9.69} \]

It often happens that the second integral in the second equation in (9.69) is not
singular at $x = \alpha$; for instance, as in the case where $f(x)$ is differentiable
at $x = \alpha$. In this case, the symbol $\mathcal{P}$ there can be dropped.

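Particularly interesting is the behavior of (9.69) in the limit $R \to \infty$, which
yields

\[ \mathcal{P}\int_{-\infty}^{\infty} \frac{f(x)}{x - \alpha}\,dx = \mathcal{P}\int_{-\infty}^{\infty} \frac{f(x) - f(\alpha)}{x - \alpha}\,dx. \tag{9.70} \]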


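Hence, substituting (9.70) into (9.65), we obtain

\[ f_R(\alpha) = \frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{f_I(x) - f_I(\alpha)}{x - \alpha}\,dx, \qquad f_I(\alpha) = -\frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{f_R(x) - f_R(\alpha)}{x - \alpha}\,dx, \tag{9.71} \]

which are complementary expressions of a Hilbert transform pair (9.66).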
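
The subtracted form (9.71) is convenient for numerical work because its integrand has only a removable singularity. The sketch below is an illustration under stated assumptions: the test function $f(z) = 1/(z + i)$ is chosen here (it is not from the text) because it is analytic for $\mathrm{Im}\,z \ge 0$ and decays at infinity, giving $f_R(x) = x/(x^2+1)$ and $f_I(x) = -1/(x^2+1)$; `subtracted_pv` is a hypothetical helper that truncates the integral symmetrically at $\pm R$.

```python
import math


def subtracted_pv(g, alpha: float, half_width: float = 1000.0, n: int = 200_000) -> float:
    """Midpoint estimate of int_{-R}^{R} [g(x) - g(alpha)] / (x - alpha) dx, R = half_width."""
    h = 2.0 * half_width / n
    g_alpha = g(alpha)
    total = 0.0
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        if x == alpha:
            continue  # removable singularity hit exactly; skip this single node
        total += (g(x) - g_alpha) / (x - alpha)
    return total * h


# Real and imaginary parts of f(x) = 1/(x + i) on the real axis.
f_real = lambda x: x / (x * x + 1.0)
f_imag = lambda x: -1.0 / (x * x + 1.0)

alpha = 0.7
f_real_est = subtracted_pv(f_imag, alpha) / math.pi    # first relation of (9.71)
f_imag_est = -subtracted_pv(f_real, alpha) / math.pi   # second relation of (9.71)
print(f_real_est, f_real(alpha))
print(f_imag_est, f_imag(alpha))
```

The symmetric truncation matters: the integrands decay only like $1/x$, and the slowly decaying odd parts cancel only between $-R$ and $R$, which is exactly the principal value prescription at infinity.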


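Remark. Equation (9.70) is equivalent to

\[ \mathcal{P}\int_{-\infty}^{\infty} \frac{f(\alpha)}{x - \alpha}\,dx = 0, \quad\text{and thus}\quad \mathcal{P}\int_{-\infty}^{\infty} \frac{dx}{x - \alpha} = 0, \]

which readily follows from the result (9.68).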

9.5.3 Dispersion Relations

Mathematical arguments given so far are interesting in their own right, but
their applications to physical sciences are also significant. In the following
discussions, we show that general physical quantities associated with response
phenomena satisfy the Hilbert transform relations given in (9.66) and (9.71).
In the language of physics, the relation between corresponding parts of Hilbert
transform pairs, referred to as a dispersion relation, plays an important role
in describing the properties of response functions.

We begin by considering a physical system for which an input $I(t)$ is related
to a response $R(t)$ in the following linear manner:

\[ R(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} G(t - t')I(t')\,dt'. \tag{9.72} \]

For example, $I(t')$ might be the electric field acting on a physical object at a
time $t'$ and $R(t)$ the resulting polarization field at time $t$. We have assumed
that $G$ depends only on the difference $t - t'$ because we want the system to
respond to a sharp input at $t_0$, as expressed by $I(t') = I_0\delta(t' - t_0)$, in the same
way as it would respond to a sharp input at $t_0 + \tau$, i.e., at a time $\tau$ later. For
the first case, we have

\[ R_1(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} G(t - t')I_0\delta(t' - t_0)\,dt' = \frac{I_0}{\sqrt{2\pi}}G(t - t_0), \tag{9.73} \]

and for the second,

\[ R_2(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} G(t - t')I_0\delta(t' - t_0 - \tau)\,dt' = \frac{I_0}{\sqrt{2\pi}}G(t - t_0 - \tau), \]

or, in other words,

\[ R_2(t + \tau) = \frac{I_0}{\sqrt{2\pi}}G(t - t_0) = R_1(t). \]

Thus if we shift the input by $\tau$, the response is also shifted by $\tau$.

Now, in order to derive the dispersion relation for the physical systems of
interest, we consider the Fourier transform of (9.72). Using the convolution
theorem, we find that

\[ r(\omega) = g(\omega)j(\omega), \]

where

\[ r(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} R(t)e^{i\omega t}\,dt, \quad g(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} G(t)e^{i\omega t}\,dt, \quad j(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} I(t)e^{i\omega t}\,dt. \]

Notably, it is possible to extend $g(\omega)$ into the complex $z$-plane, based on
the assumptions that

(i) $g(z)$ is analytic for $\mathrm{Im}\,z \ge 0$, and
(ii) $g(z) \to 0$ as $|z| \to \infty$.

Observe that (i) and (ii) are the conditions under which we derived the
Hilbert transform pair (see Sect. 9.5.1). After some discussion, we see that
$g(z)$ arising from a $G(t)$ that satisfies the necessary assumptions yields

\[ g_R(\omega) = \frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{g_I(\omega')}{\omega' - \omega}\,d\omega', \qquad g_I(\omega) = -\frac{1}{\pi}\mathcal{P}\int_{-\infty}^{\infty} \frac{g_R(\omega')}{\omega' - \omega}\,d\omega'. \tag{9.74} \]

These relations between $g_R$ and $g_I$ are called the dispersion relations for
$g$. The validity of assumptions (i) and (ii) that the function $g(z)$ must satisfy
is demonstrated in Sect. 9.5.6.


9.5.4 Kramers Kronig Relations 


The term “dispersion relation” is often restricted to mean a relation between two functions whose arguments can be measured quantitatively in experiments. For instance, in (9.74) only positive frequencies ($\omega > 0$) are actually accessible, so the relations are not directly practical as they stand. In the following, we derive an alternative expression of the dispersion relations that involves only positive, experimentally meaningful frequencies.

We first assume that $G(t)$ is real, which is obvious from (9.73), where $R_1$ and $I_0$ are real. Hence, we may proceed as follows:


$$g(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G(t)\, e^{izt}\, dt, \qquad g^*(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G^*(t)\, e^{-iz^* t}\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G(t)\, e^{-iz^* t}\, dt. \tag{9.75}$$

As a consequence, we have

$$g^*(z) = g(-z^*),$$

which is referred to as the reality condition.

Next let us assume $z$ to be real ($z = \omega$) in order to discuss the behavior of $g(z)$ on the real axis. It follows from the reality condition (9.75) that

$$g_R(\omega) - i g_I(\omega) = g_R(-\omega) + i g_I(-\omega)$$

or

$$g_R(\omega) = g_R(-\omega) \quad \text{and} \quad g_I(\omega) = -g_I(-\omega). \tag{9.76}$$

That is, $g_R$ and $g_I$ are even and odd functions of $\omega$, respectively. Note that if the conditions in (9.76) are satisfied, the function

$$G(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\omega)\, e^{-i\omega t}\, d\omega$$

becomes a real function. (The proof is left to the reader.)

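Now we rewrite the first part of (9.74) as

$$g_R(\omega) = \frac{1}{\pi}\, \mathcal{P} \int_{-\infty}^{0} \frac{g_I(\omega')}{\omega' - \omega}\, d\omega' + \frac{1}{\pi}\, \mathcal{P} \int_{0}^{\infty} \frac{g_I(\omega')}{\omega' - \omega}\, d\omega'.$$

We substitute $\omega' \to -\omega'$ in the first integral and use (9.76) to obtain

$$g_R(\omega) = \frac{2}{\pi}\, \mathcal{P} \int_{0}^{\infty} \frac{\omega'\, g_I(\omega')}{\omega'^2 - \omega^2}\, d\omega', \tag{9.77}$$

and an identical procedure yields

$$g_I(\omega) = -\frac{2\omega}{\pi}\, \mathcal{P} \int_{0}^{\infty} \frac{g_R(\omega')}{\omega'^2 - \omega^2}\, d\omega'. \tag{9.78}$$

The expressions (9.77) and (9.78) thus involve only positive, experimentally accessible frequencies. These equations are referred to as the Kramers-Kronig relations.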
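
As a numerical illustration of (9.77) and (9.78), the following Python sketch checks both relations for a damped-oscillator response $g(\omega) = 1/(\omega_0^2 - \omega^2 - i\gamma\omega)$. This model function, the parameter values $\omega_0 = 1$ and $\gamma = 0.5$, and the helper names `g`, `pv_integral`, `kk_real`, and `kk_imag` are illustrative assumptions and do not come from the text. The principal value is handled by subtracting the value of the numerator at $\omega' = \omega$, which is permissible because $\mathcal{P}\int_0^\infty d\omega'/(\omega'^2 - \omega^2) = 0$ for $\omega > 0$.

```python
import math

def g(omega, omega0=1.0, gamma=0.5):
    # Assumed model response: both poles lie in the lower half-plane, so g(z)
    # is analytic for Im z > 0 and tends to zero as |z| -> infinity.
    return 1.0 / complex(omega0**2 - omega**2, -gamma * omega)

def pv_integral(numerator, omega, n=50_000):
    # P int_0^inf numerator(x) / (x^2 - omega^2) dx for omega > 0.
    # Subtracting numerator(omega) removes the singularity at x = omega;
    # the substitution x = s / (1 - s) maps [0, inf) onto [0, 1).
    c = numerator(omega)
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n                  # midpoint rule in s
        x = s / (1.0 - s)
        denom = x * x - omega * omega
        if denom == 0.0:                   # skip an exact hit on x = omega
            continue
        total += (numerator(x) - c) / denom / (1.0 - s) ** 2
    return total / n

def kk_real(omega):
    # Right-hand side of (9.77)
    return (2.0 / math.pi) * pv_integral(lambda x: x * g(x).imag, omega)

def kk_imag(omega):
    # Right-hand side of (9.78)
    return -(2.0 * omega / math.pi) * pv_integral(lambda x: g(x).real, omega)

omega = 0.7
print(g(omega).real, kk_real(omega))   # direct value vs. (9.77)
print(g(omega).imag, kk_imag(omega))   # direct value vs. (9.78)
```

Each printed pair should agree to several digits; the midpoint rule is used only to keep the sketch free of external libraries.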


9.5.5 Subtracted Dispersion Relation 


In deriving dispersion relations, it often happens that the quantity of interest, say $g(z)$, does not tend toward zero as $|z| \to \infty$. Furthermore, we are not usually fortunate enough to know the precise behavior of the quantity as $|z|$ tends to infinity. Nevertheless, if we at least know that the quantity is bounded for large values of $|z|$, the dispersion relation can be reformulated in the following way.

Suppose that $f(z)$ is analytic in the upper half-plane, and let $\alpha_0$ be some point on the real axis at which $f(z)$ is analytic. Our aim is to derive the dispersion relation for $f(x)$ under the condition that the asymptotic behavior of $f(z)$ for $|z| \to \infty$ is unknown. Then, instead of $f(z)$, we consider the function

$$\phi(z) \equiv \frac{f(z) - f(\alpha_0)}{z - \alpha_0},$$

which is also analytic in the upper half-plane and not singular at $z = \alpha_0$, and $|\phi(z)| \to 0$ as $|z| \to \infty$ owing to the boundedness of $|f(z)|$ for $|z| \to \infty$. Thus, in a manner similar to the case in (9.65), we can write




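$$i\pi\, \phi(x) = \mathcal{P} \int_{-\infty}^{\infty} \frac{\phi(x')}{x' - x}\, dx'.$$

In actuality, we have

$$i\pi\, \frac{f(x) - f(\alpha_0)}{x - \alpha_0} = \mathcal{P} \int_{-\infty}^{\infty} \frac{f(x') - f(\alpha_0)}{(x' - x)(x' - \alpha_0)}\, dx' = \mathcal{P} \int_{-\infty}^{\infty} \frac{f(x')}{(x' - x)(x' - \alpha_0)}\, dx' - \frac{f(\alpha_0)}{x - \alpha_0}\, \mathcal{P} \int_{-\infty}^{\infty} \left( \frac{1}{x' - x} - \frac{1}{x' - \alpha_0} \right) dx',$$

so

$$i\pi \left[ f(x) - f(\alpha_0) \right] = (x - \alpha_0)\, \mathcal{P} \int_{-\infty}^{\infty} \frac{f(x')}{(x' - x)(x' - \alpha_0)}\, dx' - f(\alpha_0)\, \mathcal{P} \int_{-\infty}^{\infty} \frac{dx'}{x' - x} + f(\alpha_0)\, \mathcal{P} \int_{-\infty}^{\infty} \frac{dx'}{x' - \alpha_0}.$$

The last two principal value integrals are equal to zero, as follows from (9.83) in the limit $R \to \infty$. Hence, separating the real and imaginary parts, we finally obtain

$$f_R(x) = f_R(\alpha_0) + \frac{x - \alpha_0}{\pi}\, \mathcal{P} \int_{-\infty}^{\infty} \frac{f_I(x')}{(x' - x)(x' - \alpha_0)}\, dx', \qquad f_I(x) = f_I(\alpha_0) - \frac{x - \alpha_0}{\pi}\, \mathcal{P} \int_{-\infty}^{\infty} \frac{f_R(x')}{(x' - x)(x' - \alpha_0)}\, dx'. \tag{9.79}$$

Relations of the type of (9.79) are referred to as once-subtracted dispersion relations. Note that the relations (9.79) do not require $|f(z)|$ to vanish in the limit $|z| \to \infty$. For them to be of use in a particular physical problem, we must have a means of determining, say, $f_R(\alpha_0)$ for some $\alpha_0$.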


9.5.6 Derivation of Dispersion Relations 

This subsection provides a proof of the dispersion relation (9.74). We shall see that by making a few very reasonable assumptions about the system in question, we can show that the real and imaginary parts of the physical quantity $g(\omega)$ are intimately related to one another for real values of $\omega$ (i.e., they obey a dispersion relation). The key assumption is the causality requirement: causality of the function $G(t)$ implies the analytic properties of $g(z)$ in the upper half-plane and thus verifies the dispersion relations with respect to $g(\omega)$ on the real axis.

Toward this end, let us consider what can be said about $G(\tau)$ on general physical grounds. First to be noted is that an input at $t$ should not give rise to a response at times prior to $t$, i.e., $G(\tau) = 0$ for $\tau < 0$. Thus we have

$$R(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} G(t - t')\, I(t')\, dt', \tag{9.80}$$

which shows that the response at $t$ is the weighted linear superposition of all inputs prior to $t$, which is the causality requirement.

Secondly, the possibility that $G(\tau)$ is singular for any finite $\tau$ is excluded because, on physical grounds, the response from a sharp input given by

$$R(t) = \frac{I_0}{\sqrt{2\pi}}\, G(t - t_0), \qquad t > t_0,$$

must always be finite.

Finally, it is assumed that the effect of an input in the remote past does not appreciably influence the present. This may be stated as the requirement that $G(\tau) \to 0$ as $\tau \to \infty$, since the response to any impulse dies down after a sufficiently long time (i.e., any system has some dissipative mechanism). Furthermore, $G(\tau)$ should vanish faster than $\tau^{-1}$ so that it becomes integrable. Recall that $g(z)$ is defined through an integration of $G(t)$ with respect to $t$.

The following three points summarise our physically motivated assumptions on $G(\tau)$:

(i) $G(\tau) = 0$ for $\tau < 0$,

(ii) $G(\tau)$ is bounded for all $\tau$, and

(iii) $|G(\tau)|$ is integrable, so $G(\tau) \to 0$ faster than $1/\tau$ as $\tau \to \infty$.

We demonstrate below that these three assumptions for $G(t)$ lead naturally to the two conditions for $g(z)$ under which we have derived the dispersion relation of $g(\omega)$.

First, we show that these three conditions require that $|g(z)| \to 0$ as $|z| \to \infty$ on the upper half-plane. Owing to assumption (i), it is possible to write

$$g(\omega) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{i\omega t}\, dt.$$

We extend this relation into the complex plane by using the definition

$$g(z) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{izt}\, dt = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{i\omega t}\, e^{-\eta t}\, dt,$$

where we have written $z = \omega + i\eta$. We now restrict our attention to the upper half-plane ($\eta > 0$), where the term $e^{-\eta t}$ is a decaying exponential. Writing $z = |z| e^{i\theta}$, so that $\eta = |z| \sin\theta$, for $0 < \theta < \pi$ we have

$$|g(z)| \le \frac{M}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-\eta t}\, dt,$$

where we have replaced $|G(t)|$ by its upper bound $M$ in view of assumption (ii) above. Hence, we have

$$|g(z)| \le \frac{M}{\sqrt{2\pi}\, |z| \sin\theta}.$$

This means that for $0 < \theta < \pi$, $|g(z)| \to 0$ as $|z| \to \infty$. On the other hand, when $\theta = 0$ or $\pi$, we have

$$g(\omega, \eta = 0) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{i\omega t}\, dt.$$

This results in Parseval’s identity:

$$\int_{-\infty}^{\infty} |g(\omega, \eta = 0)|^2\, d\omega = \int_{-\infty}^{\infty} |G(t)|^2\, dt,$$

where both improper integrals converge. (See Sect. 3.4.2 for the convergence conditions of an improper integral.) Thus $|g(\omega, \eta = 0)|$ vanishes as $|\omega| \to \infty$; this also follows from the Riemann-Lebesgue lemma, since $|G(t)|$ is integrable by assumption (iii). As a result, $|g(z)| \to 0$ as $|z| \to \infty$ in the whole region $0 \le \theta \le \pi$, i.e., in any direction in the upper half-plane.

Now we want to show that $g(z)$ is analytic in the upper half-plane. Using

$$g(z) = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{izt}\, dt = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, e^{i\omega t}\, e^{-\eta t}\, dt, \tag{9.81}$$

we see that for $\eta > 0$,

$$\frac{d^n g}{dz^n} = \frac{1}{\sqrt{2\pi}} \int_{0}^{\infty} G(t)\, \frac{\partial^n}{\partial z^n} e^{izt}\, dt = \frac{i^n}{\sqrt{2\pi}} \int_{0}^{\infty} t^n\, G(t)\, e^{i\omega t}\, e^{-\eta t}\, dt. \tag{9.82}$$

The integrals in (9.82) are uniformly convergent owing to the term $e^{-\eta t}$ ($\eta > 0$, $t > 0$). Thus $g(z)$ is analytic in the upper half-plane ($\eta > 0$). Hence, for any $g(z)$ arising from a $G(t)$ that satisfies assumptions (i), (ii), and (iii), we can proceed according to the argument in Sect. 9.4.1, and we finally obtain the dispersion relation (9.74).


Exercises 

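1. Prove that

$$\mathcal{P} \int_{-R}^{R} \frac{dx}{x - a} = \ln \frac{R - a}{R + a}$$

when $-R < a < R$.

Solution: We write

$$\mathcal{P} \int_{-R}^{R} \frac{dx}{x - a} = \lim_{\varepsilon \to 0} \left[ \int_{-R}^{a - \varepsilon} \frac{dx}{x - a} + \int_{a + \varepsilon}^{R} \frac{dx}{x - a} \right].$$

Setting $x = -y$ in the first integral on the right-hand side, we find that

$$\mathcal{P} \int_{-R}^{R} \frac{dx}{x - a} = \lim_{\varepsilon \to 0} \left[ \int_{R}^{\varepsilon - a} \frac{dy}{y + a} + \ln(R - a) - \ln \varepsilon \right] = \lim_{\varepsilon \to 0} \left[ \ln \varepsilon - \ln(R + a) + \ln(R - a) - \ln \varepsilon \right] = \ln \frac{R - a}{R + a} \tag{9.83}$$

($-R < a < R$). ♣

2. By using the formula (9.71), prove that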


$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\, dx = \pi.$$

Solution: Consider the function $f(z) = e^{iz}$. This function is analytic everywhere, and if we write $z = Re^{i\theta}$, then $|f(z)| \to 0$ as $R \to \infty$ for all $\theta$ such that $0 < \theta < \pi$. In this case, $f_R(x) = \cos x$ and $f_I(x) = \sin x$, so using (9.71), we obtain

$$\cos a = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin x - \sin a}{x - a}\, dx.$$

Since $\sin x - \sin a = 2 \sin[(x - a)/2] \cos[(x + a)/2]$, there is no singularity of the integrand at $x = a$. For the special case $a = 0$, we find that $1 = (1/\pi) \int_{-\infty}^{\infty} (\sin x / x)\, dx$, i.e., $\int_{-\infty}^{\infty} (\sin x / x)\, dx = \pi$. From this result, we also obtain $\int_{0}^{\infty} (\sin x / x)\, dx = \pi/2$ by symmetry. ♣

3. Show that the integral

$$S(t) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{e^{ixt}}{x - i\varepsilon}\, dx$$

reads

$$S(t) = \begin{cases} 1, & t > 0, \\ 0, & t < 0. \end{cases}$$

Solution: Closing the contour in the upper half-plane $\mathrm{Im}(z) > 0$ for $t > 0$, and in the lower half-plane $\mathrm{Im}(z) < 0$ for $t < 0$, we have the desired result, which is the integral representation of Heaviside’s step function. ♣

Conformal Mapping


Abstract Conformal mapping refers to transformation from one complex plane 
to another such that the local angles and shapes of infinitesimally small figures 
are preserved. This special class of mapping is indispensable for solving physics 
and engineering problems that are expressed in terms of complex functions with 
inconvenient geometries. In this chapter we show that a problem can be drastically 
simplified by choosing an appropriate mapping, which allows us to evaluate the 
solution using elementary calculus. 


10.1 Fundamentals 

10.1.1 Conformal Property of Analytic Functions 

We are concerned here with the mapping properties of an analytic function $w = f(z)$ from a domain $D$ on the $z$-plane into the $w$-plane. Through the mapping, any line drawn on the $z$-plane results in a line on the $w$-plane. In particular, when $f = u + iv$ is analytic, the transformation is angle-preserving or conformal. This means that through the transformation from $(x, y)$ to $(u, v)$, the angle between two crossing lines on the $w$-plane is equal to the angle between the corresponding crossing lines on the $z$-plane (see Fig. 10.1). In physics and engineering, the subject derives its usefulness from the possibility of transforming a problem that occurs naturally in a rather difficult setting into another simpler one.

Let $D$ be a domain on the $z$-plane, and let $\Gamma_1$ and $\Gamma_2$ be two differentiable arcs lying in $D$ and intersecting at a point $z = a$ in $D$. If $f(z)$ is an analytic function in $D$, the images $f(\Gamma_1)$ and $f(\Gamma_2)$ are differentiable arcs lying in a domain $D' = f(D)$ and intersecting at a point $a' = f(a)$. Then we say the following:

Fig. 10.1. Angle-preserving property of a conformal mapping $w = f(z)$


♠ Conformal mapping:

The mapping $w = f(z)$ is conformal at $z = a$ if for every such pair of arcs, the angle between the arcs $\Gamma_1$ and $\Gamma_2$ intersecting at $z = a$ on the $z$-plane is equal to the angle between the arcs $f(\Gamma_1)$ and $f(\Gamma_2)$ at their intersecting point $f(a)$ on the $w$-plane.


The mapping is said to be conformal in $D$ if it is conformal at each point in $D$. We shall see that if a function $w = f(z)$ is analytic, it is necessarily conformal except at the isolated points where $f'(z)$ vanishes; this fact is formally stated below.

♠ Theorem:

Given an analytic function $f(z)$, the mapping $w = f(z)$ is conformal at $z = a$ if and only if $f'(a) \ne 0$.


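Proof For proving sufficiency, we consider the arcs $\Gamma_1$ and $\Gamma_2$ given parametrically by

$$z_1 = \Gamma_1(t) \quad \text{and} \quad z_2 = \Gamma_2(t) \quad (0 < t < 1)$$

and assume that $z_1$, $z_2$ are points on $\Gamma_1$, $\Gamma_2$ at a short distance $\varepsilon$ from $z = a$. Then, from the relation

$$z_1 - a = \varepsilon e^{i\alpha}, \qquad z_2 - a = \varepsilon e^{i\beta},$$

we have the ratio

$$\frac{z_2 - a}{z_1 - a} = e^{i(\beta - \alpha)}.$$

As $\varepsilon \to 0$, $\beta - \alpha$ must approach the angle $\theta$ between the curves on the $z$-plane. That is,

$$\theta = \lim_{\varepsilon \to 0} \arg \left[ \frac{z_2 - a}{z_1 - a} \right].$$

For the angle $\theta'$ between the arcs $f(\Gamma_1)$ and $f(\Gamma_2)$ at $f(a)$, we have

$$\theta' = \lim_{\varepsilon \to 0} \arg \left[ \frac{f(z_2) - f(a)}{f(z_1) - f(a)} \right] = \lim_{\varepsilon \to 0} \arg \left[ \frac{\dfrac{f(z_2) - f(a)}{z_2 - a}\, (z_2 - a)}{\dfrac{f(z_1) - f(a)}{z_1 - a}\, (z_1 - a)} \right] = \lim_{\varepsilon \to 0} \arg \left[ \frac{f'(a)\, (z_2 - a)}{f'(a)\, (z_1 - a)} \right] = \theta \quad \text{if } f'(a) \ne 0. \tag{10.1}$$

Thus, the condition $f'(a) \ne 0$ is sufficient. Conversely, if $f^{(n)}(a) = 0$ with $n = 1, 2, \cdots, p - 1$ and $f^{(p)}(a) \ne 0$, near $z = a$ we have

$$f(z) = f(a) + \frac{f^{(p)}(a)}{p!}\, (z - a)^p + O\left[ (z - a)^{p+1} \right].$$

Thus, we get

$$\lim_{\varepsilon \to 0} \arg \left[ \frac{f(z_2) - f(a)}{f(z_1) - f(a)} \right] = \lim_{\varepsilon \to 0} \arg \left[ \frac{(z_2 - a)^p}{(z_1 - a)^p} \right] = p \lim_{\varepsilon \to 0} \arg \left[ \frac{z_2 - a}{z_1 - a} \right] = p\theta,$$

which shows that the angle is magnified by $p$. Therefore, if the mapping $w = f(z)$ is conformal, we necessarily have $p = 1$, which completes the proof of the necessity of the condition. ♣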
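
The limit used in the proof can be imitated numerically. The following Python sketch is an illustration only: the helper name `angle_between`, the test function $f(z) = z^2$, the base points, and the step `eps` are all assumptions made here. It measures the angle between the images of two short segments leaving $z = a$ in directions $\alpha$ and $\beta$, once at a point with $f'(a) \ne 0$ and once at $a = 0$, where $f'(0) = 0$ and $p = 2$.

```python
import cmath

def angle_between(f, a, alpha, beta, eps=1e-6):
    # arg[(f(z2) - f(a)) / (f(z1) - f(a))] for points z1, z2 at a small
    # distance eps from a, in the directions alpha and beta respectively.
    z1 = a + eps * cmath.exp(1j * alpha)
    z2 = a + eps * cmath.exp(1j * beta)
    return cmath.phase((f(z2) - f(a)) / (f(z1) - f(a)))

def square(z):
    return z * z

alpha, beta = 0.3, 1.1   # the angle in the z-plane is beta - alpha = 0.8
print(angle_between(square, 1 + 1j, alpha, beta))   # f'(a) = 2a is nonzero
print(angle_between(square, 0j, alpha, beta))       # f'(0) = 0, p = 2
```

The first value should stay close to 0.8 (angle preserved) and the second close to 1.6 (angle doubled), in line with the factor $p$ found in the proof.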


10.1.2 Scale Factor 


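There is another important geometric property that analytic functions possess: whenever $f(z)$ is analytic, any infinitesimal figure plotted on the $z$-plane is transformed into a similar figure on the $w$-plane with a change in size but with the proportions (and angles) preserved. We prove this by considering the length of an infinitesimally small quantity $df$ given by

$$df = du + i\, dv = \left( \frac{\partial u}{\partial x}\, dx + \frac{\partial u}{\partial y}\, dy \right) + i \left( \frac{\partial v}{\partial x}\, dx + \frac{\partial v}{\partial y}\, dy \right). \tag{10.2}$$

Its squared length reads

$$|df|^2 = \left( \frac{\partial u}{\partial x}\, dx + \frac{\partial u}{\partial y}\, dy \right)^2 + \left( \frac{\partial v}{\partial x}\, dx + \frac{\partial v}{\partial y}\, dy \right)^2 = \left[ \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial v}{\partial x} \right)^2 \right] dx^2 + \left[ \left( \frac{\partial u}{\partial y} \right)^2 + \left( \frac{\partial v}{\partial y} \right)^2 \right] dy^2 + 2 \left( \frac{\partial u}{\partial x} \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \frac{\partial v}{\partial y} \right) dx\, dy. \tag{10.3}$$

Substituting the Cauchy-Riemann relations into (10.3), the cross term vanishes and we obtain

$$|df|^2 = h^2\, |dz|^2, \quad \text{where} \quad h = \sqrt{ \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial v}{\partial x} \right)^2 } = \sqrt{ \left( \frac{\partial u}{\partial y} \right)^2 + \left( \frac{\partial v}{\partial y} \right)^2 }. \tag{10.4}$$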

The quantity $h$ is known as a scale factor and measures the magnification ratio of elementary line segments under the transformation $w = f(z)$. From (10.4), it readily follows that

$$h = \left| \frac{df}{dz} \right|. \tag{10.5}$$

We see from (10.5) that since $df/dz$ is isotropic, the scale factor $h$ is also isotropic (i.e., independent of the direction of $dz$) for any analytic function $f$. This means that any infinitesimal figures on the $z$-plane are transformed into similar figures on the $w$-plane with a change in their size by $h = |df/dz|$.

Note that the magnitude of $h$ depends on the point $z$ and may vanish at points where $f'(z) = 0$. Points where $f'(z) = 0$ are called critical points of the transformation $w = f(z)$, and at these points the transformation becomes nonconformal. The simplest example is

$$f(z) = z^2,$$

for which we have

$$h = |f'(0)| = 0$$

at $z = 0$. In fact, when two line elements passing through $z = 0$ make an angle $\beta - \alpha$ with respect to one another, the corresponding lines on the $w$-plane make an angle of $2(\beta - \alpha)$. Thus the mapping is not conformal at $z = 0$. In general, the image on the $w$-plane of the neighborhood of a point at which $h = 0$ becomes greatly compressed. In contrast, the corresponding region on the $z$-plane is tremendously expanded relative to its image.


10.1.3 Mapping of a Differential Area 

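The scale factor $h$ given in (10.4) can be derived in a different way by considering the conformal mapping of a differential area. Let $f(z)$ be a conformal mapping that transforms the points in a domain $D$ of the $z$-plane onto a region $S$ of the $w$-plane. In the domain $D$, we define a rectangular differential area element with sides parallel to the $x$- and $y$-axes. These sides are given by

$$dz_1 = dx \quad \text{and} \quad dz_2 = i\, dy.$$

The images of $dz_1$ and $dz_2$ are differential curves in the $w$-plane given by

$$dw_1 = du_1 + i\, dv_1 \quad \text{and} \quad dw_2 = du_2 + i\, dv_2.$$

Note that the differential area element of the rectangle in the $z$-plane reads $dA_z = dx\, dy$ and that of the parallelogram in the $w$-plane is

$$dA_w = \left| \mathrm{Im}\left( dw_1^*\, dw_2 \right) \right|.$$

Since $dz_1 = dx$ and $dz_2 = i\, dy$, the images of these line elements can be written as

$$dw_1 = \frac{\partial f}{\partial x}\, dx = \left( \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} \right) dx$$

and

$$dw_2 = \frac{1}{i} \frac{\partial f}{\partial y}\, i\, dy = \left( \frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y} \right) dy.$$

Therefore, $dA_w$ is given by

$$dA_w = \left| \mathrm{Im}\left( dw_1^*\, dw_2 \right) \right| = \left| \frac{\partial u}{\partial x} \frac{\partial v}{\partial y} - \frac{\partial u}{\partial y} \frac{\partial v}{\partial x} \right| dx\, dy = \left| \frac{\partial(u, v)}{\partial(x, y)} \right| dA_z, \tag{10.6}$$

where

$$\frac{\partial(u, v)}{\partial(x, y)} = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \frac{\partial u}{\partial x} \frac{\partial v}{\partial y} - \frac{\partial u}{\partial y} \frac{\partial v}{\partial x}$$

is called the Jacobian determinant of the transformation. Since $f(z)$ is analytic, $u$ and $v$ satisfy the Cauchy-Riemann relations over the domain $D$, so the Jacobian determinant can be written as

$$\frac{\partial(u, v)}{\partial(x, y)} = \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial u}{\partial y} \right)^2 = \left( \frac{\partial v}{\partial x} \right)^2 + \left( \frac{\partial v}{\partial y} \right)^2. \tag{10.7}$$

This provides a physical interpretation of the Jacobian determinant $\partial(u, v)/\partial(x, y)$; namely, it is identical to the square of the scale factor $h$ introduced in (10.4).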
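
The two statements above, that $h$ is independent of the direction of $dz$ and that the Jacobian determinant equals $h^2$, can be checked with finite differences. The sketch below is only an illustration: the test function $f(z) = e^z$, the point `z0`, the step `eps`, and the helper names `partials`, `jacobian`, and `scale_factor` are assumptions introduced here.

```python
import cmath

def partials(f, z, eps=1e-6):
    # Central differences for u_x, u_y, v_x, v_y of f = u + iv at z = x + iy
    fx = (f(z + eps) - f(z - eps)) / (2 * eps)
    fy = (f(z + 1j * eps) - f(z - 1j * eps)) / (2 * eps)
    return fx.real, fy.real, fx.imag, fy.imag

def jacobian(f, z):
    # d(u, v)/d(x, y) = u_x v_y - u_y v_x
    ux, uy, vx, vy = partials(f, z)
    return ux * vy - uy * vx

def scale_factor(f, z, direction, eps=1e-6):
    # |df| / |dz| for a small step dz = eps * exp(i * direction)
    dz = eps * cmath.exp(1j * direction)
    return abs(f(z + dz) - f(z - dz)) / (2 * eps)

z0 = 0.4 + 0.9j
h_values = [scale_factor(cmath.exp, z0, d) for d in (0.0, 0.7, 1.5, 2.8)]
print(h_values)                  # nearly identical for every direction
print(jacobian(cmath.exp, z0))   # compare with h**2 = |exp(z0)|**2
print(abs(cmath.exp(z0)) ** 2)
```

For a non-analytic function the values in `h_values` would in general differ from one direction to the next, which is one way to see why analyticity matters here.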


10.1.4 Mapping of a Tangent Line 

We consider the mapping of a tangent line. Let $C$ be a curve in the $z$-plane and $\Gamma$ be the image of $C$ in the $w$-plane (see Fig. 10.2). A differential segment $dw$ along $\Gamma$ is related to the differential segment $dz$ along $C$ by

$$dw = \frac{df}{dz}\, dz = f'(z)\, dz. \tag{10.8}$$

We suppose $w_0$ to be a point on $\Gamma$ that is the image of $z_0$ on $C$. Then from (10.8), the tangent to $\Gamma$ at $w_0$, denoted by $\tau(w_0)$, is related to the tangent to $C$ at $z_0$, denoted by $t(z_0)$:

$$\tau(w_0) \equiv \frac{dw}{d\lambda} \bigg|_{w = w_0} = f'(z)\, \frac{dz}{d\lambda} \bigg|_{z = z_0} = f'(z_0)\, t(z_0), \tag{10.9}$$

where $\lambda$ parametrizes the curve $\Gamma$ on the $w$-plane (and, through the mapping, the curve $C$).

An immediate consequence of equation (10.9) is that if $f'(z_0) = 0$, the tangent $t(z_0)$ on the $z$-plane cannot be related to the tangent $\tau(w_0)$ on the $w$-plane. The point $z_0$ that satisfies $f'(z_0) = 0$ is called a critical point on the curve. For simplicity in the following discussion, we assume that the curve $C$ does not contain any critical points.

The characteristics of the mapping (10.9) become clear by employing the polar form:

$$\tau(w_0) = |\tau(w_0)|\, e^{i\psi(w_0)}, \quad f'(z_0) = |f'(z_0)|\, e^{i\phi(z_0)}, \quad \text{and} \quad t(z_0) = |t(z_0)|\, e^{i\theta(z_0)}. \tag{10.10}$$

The first equation shows that $\tau(w_0)$ is oriented at an angle $\psi(w_0)$ to the $u$-axis; similarly, the third one shows that $t(z_0)$ makes an angle $\theta(z_0)$ with the $x$-axis. It follows from (10.9) that

$$|\tau(w_0)|\, e^{i\psi(w_0)} = |f'(z_0)|\, |t(z_0)|\, e^{i[\phi(z_0) + \theta(z_0)]}.$$

Thus the magnitude of $\tau(w_0)$ and its argument read

$$|\tau(w_0)| = |f'(z_0)|\, |t(z_0)| \quad \text{and} \quad \psi(w_0) = \theta(z_0) + \phi(z_0).$$

Each equation gives us the properties of the conformal mapping of a tangent line as follows:

(i) The magnitude of the tangent $|t(z_0)|$ is modified by the scale factor $|f'(z_0)|$, thus being enlarged or shrunk by the mapping. Since $|f'(z_0)|$ depends on $z_0$, the magnification varies from point to point on $C$.

(ii) The angle between the tangent $t(z_0)$ and the $x$-axis at $z_0$ differs from the angle between the tangent $\tau(w_0)$ and the $u$-axis at $w_0$. The difference is determined by the argument of $f'(z_0)$, denoted by $\phi$, called the argument of the mapping; $\phi$ also depends on $z_0$ and thus varies from point to point on $C$.

Fig. 10.2. Conformal mapping of a tangential line


10.1.5 The Point at Infinity 

For later use, we introduce a few concepts that are at the basis of further investigations on conformal mapping. Our aim is to understand the way in which the entire spherical curved surface is mapped conformally onto the entire flat plane with a one-to-one correspondence. This is achieved with the help of a stereographic projection between the complex plane and an artificial sphere as described below.

Let us consider a sphere of radius $R$ (for convenience, $R$ is taken as 1/2) such that the complex plane is tangential to it at the origin, as shown in Fig. 10.3. The point $P$ on the sphere opposite the origin (called the north pole, for convenience) is used as the “eye” of the stereographic projection. We draw straight lines through $P$ that intersect both the sphere and the plane. These lines permit a mapping of a point $z$ on the plane onto the point $\zeta$ on the sphere (see Fig. 10.3). In this fashion the entire complex plane is mapped onto the sphere (called a Riemann or a complex sphere).

As to the properties of the Riemann sphere, the following statements can be verified without much difficulty.

1. Straight lines in the $z$-plane are mapped onto circles on the sphere that pass through $P$.

2. The images of intersecting straight lines on the plane have two common points on the Riemann sphere, one of which is $P$.

3. The images of parallel straight lines on the $z$-plane have only the point $P$ in common, and they have a common tangent at $P$.

4. The exterior of a circle $|z| = R$ with $R \gg 1$ is mapped onto the interior of a small spherical cap around point $P$. As $R \to \infty$ the cap shrinks to $P$.

Fig. 10.3. Riemann sphere

Note that the point $P$ itself has no counterpart on the $z$-plane. Nevertheless, it has been found convenient to adjoin an extra point to the $z$-plane, known as the point at infinity, in such a way that a curve passing through $P$ on the Riemann sphere is the image of a curve on the $z$-plane that approaches the point at infinity.


♠ Point at infinity:

The point at infinity $z = \infty$ is defined as the point $z$ that is mapped onto the origin $z' = 0$ by the transformation $z' = 1/z$.


The importance of the point at infinity is greatly enhanced once we appreciate the conformal property of the stereographic projection: i.e., if two curves intersect on the $z$-plane at an angle $\gamma$, then their images on the sphere intersect at the same angle. This conformal property permits the definition of the angle between two parallel straight lines on the $z$-plane, i.e., the angle that their images make on the sphere at point $P$. (Indeed this angle is equal to zero as noted in 3 above.)
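
The projection described above can be written out explicitly. The following Python sketch assumes coordinates in which the plane is $Z = 0$, the sphere of radius 1/2 is centred at $(0, 0, 1/2)$, and $P = (0, 0, 1)$; intersecting the line through $P$ and $(x, y, 0)$ with the sphere then gives the image point $\left( x, y, |z|^2 \right)/(1 + |z|^2)$. The coordinate choice and the function names `to_sphere`, `to_plane`, `on_sphere`, and `distance_to_pole` are assumptions made for this illustration.

```python
import math

NORTH_POLE = (0.0, 0.0, 1.0)   # the point P

def to_sphere(z):
    # Intersection of the line through P and (x, y, 0) with the sphere
    s = abs(z) ** 2
    return (z.real / (1 + s), z.imag / (1 + s), s / (1 + s))

def to_plane(point):
    # Inverse projection; not defined at P itself, where Z = 1
    X, Y, Z = point
    return complex(X, Y) / (1 - Z)

def on_sphere(point, tol=1e-12):
    # Checks X^2 + Y^2 + (Z - 1/2)^2 = 1/4
    X, Y, Z = point
    return abs(X * X + Y * Y + (Z - 0.5) ** 2 - 0.25) < tol

def distance_to_pole(point):
    return math.dist(point, NORTH_POLE)

p = to_sphere(2 - 1j)
print(p, on_sphere(p), to_plane(p))
print(distance_to_pole(to_sphere(1e6 + 0j)))   # far-away points land near P
```

The last line illustrates statement 4: points with large $|z|$ are sent into a small cap around $P$.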

10.1.6 Singular Point at Infinity 

The concept of a point at infinity is closely interwoven with the study of singularities of analytic functions. The notion of analyticity can be extended to a point at infinity by the following device: A function $f(z)$ is considered to be analytic at infinity if the function

$$g(z) = f(1/z)$$

is analytic at $z = 0$. A more precise statement on this matter is given below.


♠ Extended definition of conformal mappings:

A function $w = f(z)$ is said to transform the neighborhood of a point $z_0$ conformally into a neighborhood of $w = \infty$ if the function $\eta = 1/f(z)$ transforms the neighborhood of $z_0$ conformally into a neighborhood of $\eta = 0$.

Example The mapping $w = 1/z$ is conformal at the origin $z = 0$. Initially, the function $f(z) = 1/z$ is not defined at $z = 0$; however, the subterfuge based on the Riemann sphere makes the mapping $w = 1/z$ meaningful (and, furthermore, conformal) at $z = 0$. Note that it is also conformal at $z = \infty$ even though the derivative $f'(z)$ approaches zero as $z \to \infty$.

Owing to the above convention, it becomes possible to introduce the concept of a pole at infinity, a branch at infinity, and so on, through the corresponding behavior of $g(z) = f(1/z)$ at the origin. In fact, owing to our convention, a function $f(z) = e^z$ that has no singularities in the original $z$-plane comes to possess an essential singularity at infinity. Other functions that have no singularities at finite $z$ (e.g., all the nonconstant polynomials in $z$) are also found to have a breakdown of analyticity at infinity. In contrast, nonconstant functions that are analytic at infinity possess at least one singularity for some finite value of $z$. The natural conjecture is that there may not be a perfectly analytic function. This problem has actually been resolved and is embodied in the theorem below.

♠ Entire function:

A function $f(z)$ whose only singularity is an isolated singularity at the point at infinity $z = \infty$ is called an entire function (or integral function). If this singularity is a pole of $m$th order, then $f(z)$ must be a polynomial of degree $m$.

Liouville theorem:

The only function $f(z)$ that is analytic in the entire complex plane as well as at the point at infinity is the constant function $f(z) = \mathrm{const.}$


Remark. In some texts the term “complex plane” is tacitly assumed to mean the extended complex plane with the point at infinity included. Certain theorems may then be stated more conveniently. However, one should never forget that while there is a point at infinity, there is still no such thing as a complex number “infinity” in the sense that it possesses the algebraic properties shared by other complex numbers.


Exercises 

1. Suppose that two differentiable curves on the $z$-plane meet at a point $z_0$ at which $f'(z_0) = f''(z_0) = \cdots = f^{(m-1)}(z_0) = 0$ and $f^{(m)}(z_0) \ne 0$. Show that the angle $\theta$ between the two curves is magnified by $m$ times through the mapping $w = f(z)$.

Solution: From the hypothesis, $f(z)$ can be expanded in the neighborhood of the point $z_0$ as

$$f(z) = f(z_0) + c_m (z - z_0)^m + c_{m+1} (z - z_0)^{m+1} + \cdots,$$

where $c_m \ne 0$. Then, by the same scenario as we used in deriving (10.1), the angle $\theta'$ between the mapped arcs at $f(z_0)$ reads

$$\theta' = \lim_{\varepsilon \to 0} \arg \left[ \frac{f(z_2) - f(z_0)}{f(z_1) - f(z_0)} \right] = \lim_{\varepsilon \to 0} \arg \left[ \left( \frac{z_2 - z_0}{z_1 - z_0} \right)^m \right] = m \lim_{\varepsilon \to 0} \arg \left[ \frac{z_2 - z_0}{z_1 - z_0} \right] = m\theta. \ ♣$$


2. We say that the mapping $w = f(z)$ is locally one-to-one at $z_0$ if $f(z_1) \ne f(z_2)$ for any two distinct points $z_1$ and $z_2$ within the circle $|z - z_0| < \delta$ with some $\delta > 0$. Show that $w = f(z)$ is locally one-to-one at $z_0$ if $f(z)$ is analytic at $z_0$ and $f'(z_0) \ne 0$.

Solution: Let $f(z_0) = \alpha$ and take $\delta > 0$ small enough so that $f(z) - \alpha$ has no other zero in $|z - z_0| \le \delta$. In view of the theorem regarding the isolated property of zeros, such a $\delta$ can always be found. Since $f'(z_0) \ne 0$, the zero at $z_0$ is simple, and the argument principle says that

$$1 = \frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z) - \alpha}\, dz,$$

where $C$ is a circle $|z - z_0| = \delta$. Denoting $\Gamma = f(C)$, we have

$$1 = \frac{1}{2\pi i} \oint_\Gamma \frac{dw}{w - \alpha} = \frac{1}{2\pi i} \oint_\Gamma \frac{dw}{w - \beta}$$

for any $\beta$ satisfying $|\beta - \alpha| < \varepsilon$ with sufficiently small $\varepsilon$. If we take $\delta' < \delta$ so that

$$D = \{ z;\ |z - z_0| < \delta' \} \subset f^{-1}\left[ D^* = \{ w;\ |w - \alpha| < \varepsilon \} \right],$$

it follows that for any $z_1, z_2 \in D$,

$$1 = \frac{1}{2\pi i} \oint_\Gamma \frac{dw}{w - f(z_1)} = \frac{1}{2\pi i} \oint_\Gamma \frac{dw}{w - f(z_2)},$$

or equivalently,

$$1 = \frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z) - f(z_1)}\, dz = \frac{1}{2\pi i} \oint_C \frac{f'(z)}{f(z) - f(z_2)}\, dz.$$

This means that each function $f(z) - f(z_1)$ and $f(z) - f(z_2)$ has only one zero inside the circle $|z - z_0| = \delta$. Therefore, we conclude that $f(z_1) \ne f(z_2)$ if $z_1 \ne z_2$. ♣

10.2 Elementary Transformations

10.2.1 Linear Transformations 

The simplest conformal mapping $w = f(z)$ would be the following:

♠ Linear transformation:

$$w = \alpha z + \beta, \tag{10.11}$$

where $\alpha$ and $\beta$ are complex numbers (with $\alpha \ne 0$).


A linear transformation generates a translation plus a magnification and a rotation of a polygon, but does not affect its shape. Thus, for example, a line maps to a line, a rectangle maps to a rectangle, a circle maps to a circle, etc.

To appreciate the above statement, we first consider the particular case of $\alpha = 1$. From (10.11), we have

$$w = z + \beta, \tag{10.12}$$

which describes a translation by the constant $\beta$ of the points being mapped. Obviously, a translation does not modify the length of a line or its orientation; it only changes its position with respect to the coordinate axes. Since a polygon is constructed from three or more lines, the size and orientation of a polygon are not affected by a translation; only the position of the polygon is changed.

Next we consider the case of $\beta = 0$. When we express $\alpha$ in polar form, the linear transformation becomes

$$w = |\alpha|\, e^{i\gamma} z$$

with a constant argument $\gamma$. Then, the line between two points transforms as

$$w_1 - w_2 = |\alpha|\, e^{i\gamma} (z_1 - z_2) = |\alpha| \cdot |z_1 - z_2|\, e^{i(\gamma + \theta)},$$

where $\theta = \arg(z_1 - z_2)$. Therefore, the length of a line in the $z$-plane, $|z_1 - z_2|$, becomes magnified by a constant factor $|\alpha|$ and the line is rotated through an angle $\gamma$. Thus, the lengths of the sides of a polygon and the orientation of the polygon with respect to the axes are modified. Nevertheless, its shape remains unchanged by the linear transformation with $\beta = 0$.
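
The effect of a linear transformation on a polygon is easy to check numerically. In the sketch below, the coefficient values, the 3-4-5 triangle, and the helper names `linear_map`, `fit_linear`, and `side_lengths` are illustrative assumptions. It verifies that every side is scaled by the same factor $|\alpha|$, and it also recovers $\alpha$ and $\beta$ from two point correspondences, using $\alpha = (w_1 - w_2)/(z_1 - z_2)$ and $\beta = w_1 - \alpha z_1$.

```python
import cmath

def linear_map(alpha, beta):
    # w = alpha * z + beta
    return lambda z: alpha * z + beta

def fit_linear(z1, z2, w1, w2):
    # Recover alpha and beta from two point pairs (requires z1 != z2)
    alpha = (w1 - w2) / (z1 - z2)
    return alpha, w1 - alpha * z1

def side_lengths(pts):
    # Lengths of the sides of a triangle given by three complex vertices
    return [abs(pts[(k + 1) % 3] - pts[k]) for k in range(3)]

alpha, beta = 2 * cmath.exp(1j * cmath.pi / 6), 1 - 3j   # illustrative values
f = linear_map(alpha, beta)
triangle = [0j, 3 + 0j, 3 + 4j]                          # sides 3, 4, 5
image = [f(z) for z in triangle]

print(side_lengths(triangle))
print(side_lengths(image))     # each side scaled by |alpha| = 2
print(fit_linear(triangle[0], triangle[1], image[0], image[1]))
```

Because all sides are scaled by the same factor, the image triangle is similar to the original one, which is the shape-preserving property stated above.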

We have seen that the values of $\alpha$ and $\beta$ straightforwardly determine the image of a polygon in the $z$-plane under a particular linear transformation. Conversely, if one knows the coordinates of two points on the original polygon in the $z$-plane and the images of those two points in the $w$-plane, one can determine $\alpha$ and $\beta$ and thus the linear transformation.

10.2.2 Bilinear Transformations


There is another important conformal mapping referred to as the bilinear transformation (or the fractional or Möbius transformation):


♠ Bilinear transformation:

$$w = \frac{\alpha z + \beta}{\gamma z + \delta}, \tag{10.13}$$

where $\alpha$, $\beta$, $\gamma$, and $\delta$ are complex numbers satisfying the relation $\alpha\delta - \beta\gamma \ne 0$.


The condition $\alpha\delta - \beta\gamma \ne 0$ ensures that

$$\frac{df}{dz} = \frac{\alpha\delta - \beta\gamma}{(\gamma z + \delta)^2}$$

is nonzero at any finite point of the plane. Accordingly, the bilinear transformation (10.13) possesses the one-to-one property because if $f(z_1) = f(z_2)$, then

$$\frac{\alpha z_1 + \beta}{\gamma z_1 + \delta} = \frac{\alpha z_2 + \beta}{\gamma z_2 + \delta},$$

which implies $(\alpha\delta - \beta\gamma)(z_1 - z_2) = 0$, and thus $z_1 = z_2$.


Remark.

1. If $\gamma = 0$, the bilinear transformation (10.13) reduces to a linear transformation, which has already been discussed. Thus, we require that $\gamma \ne 0$ in what follows.

2. The function $f(z) = (\alpha z + \beta)/(\gamma z + \delta)$ serves as a general solution (see Sect. 15.1.4) of the differential equation

$$\frac{f'''}{f'} - \frac{3}{2} \left( \frac{f''}{f'} \right)^2 = 0,$$

which is called the Schwarz differential equation.


Observe that the mapping (10.13) has two apparent exceptional points: $z = \infty$, and $z = -\delta/\gamma$, at which $w$ diverges. It is possible to weed out these exceptions by extending the definition of conformal representation such that the point at infinity is included. With such an extension, the conformal property of the transformation (10.13) at the two points is recovered, even though the function $f(z)$ itself diverges at $z = -\delta/\gamma$. Similarly, we can say that $w = f(z)$ transforms the neighborhood of $z = \infty$ conformally into that of a point $w_0$ if $w = \phi(\xi) = f(1/\xi)$ transforms the neighborhood of $\xi = 0$ conformally into that of the point $w_0$.

A particularly interesting example of the bilinear transformation is

$$w = f(z) = \frac{z - z_0}{z - z_0^*}, \tag{10.14}$$

where $\mathrm{Im}(z_0) > 0$. This transformation maps the upper half of the $z$-plane onto the interior of the unit circle centered at the origin of the $w$-plane, and the $x$-axis onto the circle itself. This is demonstrated in Exercise 1.
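
A quick numerical check of the transformation (10.14) is given below, assuming $\mathrm{Im}(z_0) > 0$. The function name `half_plane_to_disk`, the value of `z0`, and the sample points are illustrative choices. For real $z$ the numerator and denominator have equal moduli, so $|w| = 1$; for $\mathrm{Im}\, z > 0$ the point $z$ is closer to $z_0$ than to $z_0^*$, so $|w| < 1$.

```python
def half_plane_to_disk(z, z0):
    # w = (z - z0) / (z - conj(z0)), assuming Im(z0) > 0
    return (z - z0) / (z - z0.conjugate())

z0 = 1 + 2j   # illustrative choice with Im(z0) > 0
on_axis = [half_plane_to_disk(complex(x, 0.0), z0)
           for x in (-50.0, -1.0, 0.0, 2.5, 40.0)]
above = [half_plane_to_disk(z, z0) for z in (0.1j, 3 + 1j, -7 + 0.5j, 100j)]

print([abs(w) for w in on_axis])    # points of the real axis: |w| = 1
print([abs(w) for w in above])      # points with Im z > 0: |w| < 1
print(half_plane_to_disk(z0, z0))   # z0 itself is sent to the centre w = 0
```

The sample points only spot-check the claim; the general statement follows from the comparison of distances noted in the lead-in.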


10.2.3 Miscellaneous Transformations 


In what follows, we note several elementary transformations that facilitate a better understanding of the conformal nature of analytic functions. We shall see that any conformal transformation may be regarded as a transformation from Cartesian to orthogonal curvilinear coordinates.

Example 1. $w = z^2$, $w = \sqrt{z}$

Consider the mapping defined by

$$w = z^2. \tag{10.15}$$

Setting $z = x + iy$ and separating the real and imaginary parts, we have

$$x^2 - y^2 = u, \qquad 2xy = v. \tag{10.16}$$

Thus, the straight lines parallel to the $y$- and $x$-axes in the $z$-plane denoted by

$$x = a \quad \text{and} \quad y = b$$

are mapped onto parabolas in the $w$-plane given by

$$u = a^2 - \frac{v^2}{4a^2} \quad \text{and} \quad u = \frac{v^2}{4b^2} - b^2,$$

respectively. This is shown schematically in Fig. 10.4.
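
The two image curves can be confirmed by direct substitution. The short Python sketch below uses illustrative values of $a$, $b$ and of the sample points (assumptions made here, along with the helper names) and evaluates the residuals of the two curve equations for points $w = z^2$ taken from the lines $x = a$ and $y = b$.

```python
def image_of_vertical_line(a, ys):
    # Points w = (a + iy)^2 for the line x = a
    return [complex(a, y) * complex(a, y) for y in ys]

def image_of_horizontal_line(b, xs):
    # Points w = (x + ib)^2 for the line y = b
    return [complex(x, b) * complex(x, b) for x in xs]

a, b = 1.5, 0.8
samples = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Residuals of u = a^2 - v^2/(4a^2) and of u = v^2/(4b^2) - b^2
res_vertical = [w.real - (a * a - w.imag ** 2 / (4 * a * a))
                for w in image_of_vertical_line(a, samples)]
res_horizontal = [w.real - (w.imag ** 2 / (4 * b * b) - b * b)
                  for w in image_of_horizontal_line(b, samples)]

print(res_vertical)     # all close to zero
print(res_horizontal)   # all close to zero
```

Both families of curves have their focus at $w = 0$ and open in opposite directions, which is consistent with their meeting at right angles wherever the original lines do.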

Another important feature of the mapping (10.15) is found by expressing 
z and w in polar coordinates: 


$$z = \rho e^{i\varphi}, \quad w = r e^{i\theta}.$$

On substitution in (10.15), we obtain

$$r = \rho^2, \quad \theta = 2\varphi. \tag{10.17}$$

Hence, the upper half of the $z$-plane, $0 \le \varphi < \pi$, goes into the entire $w$-plane, $0 \le \theta < 2\pi$; the lower half also goes into the entire $w$-plane. In other words, points $z$ and $-z$ in the $z$-plane go into the same point in the $w$-plane. This suggests the possibility that some distinct geometric figures in the $z$-plane may go into coincident figures in the $w$-plane.

Next we consider the transformation $w = \sqrt{z}$. In terms of polar forms, it reads

$$r e^{i\theta} = \sqrt{\rho e^{i\varphi}},$$

so that we have

$$r = \sqrt{\rho}, \quad \theta = \frac{\varphi}{2} + n\pi. \tag{10.18}$$

Owing to the additional term $n\pi$ in the latter equation in (10.18), one complete revolution in the $z$-plane corresponds to only a half revolution in the $w$-plane. This is a manifestation of the multivaluedness of the root function. The mapping of the upper half of the $z$-plane onto the $w$-plane is illustrated schematically in Fig. 10.5.

Fig. 10.5. Mapping $w = \sqrt{z}$

Example 2. $w = e^z$, $w = \log z$

In the case of

$$w = e^z, \tag{10.19}$$

there are simple relationships between the Cartesian coordinates in the $z$-plane and the polar coordinates in the $w$-plane:

$$r e^{i\theta} = e^{x+iy} = e^x(\cos y + i\sin y); \quad \text{i.e.,} \quad r = e^x, \quad \theta = y.$$

The lines $x = \text{const.}$, parallel to the $y$-axis, become concentric circles in the $w$-plane; the lines $y = \text{const.}$, parallel to the $x$-axis, become rays emerging from the origin. Accordingly, a strip of the $z$-plane bounded by $y = y_0$ and $y = y_0 + 2\pi$ goes into the entire $w$-plane.

The inverse of (10.19) is

$$z = \log w, \quad x = \log r, \quad y = \theta + 2n\pi,$$

which is an infinitely many-valued function since all points for different values of $n$ correspond to the same point in the $w$-plane.

Example 3. $w = \cosh z$

Next let us consider the function

$$w = \cosh z.$$

The Cartesian coordinates in the two planes are related as follows:

$$u + iv = \cosh(x + iy) = \cosh x\cos y + i\sinh x\sin y,$$

$$u = \cosh x\cos y, \quad v = \sinh x\sin y. \tag{10.20}$$

Dividing the first equation by $\cosh x$, the second by $\sinh x$, squaring and adding, we have an ellipse in the $w$-plane that corresponds to the straight line $x = \text{const.}$ in the $z$-plane. Similarly, $y = \text{const.}$ goes into a hyperbola in the $w$-plane. The equations of the ellipses and hyperbolas are

$$\frac{u^2}{\cosh^2 x} + \frac{v^2}{\sinh^2 x} = 1, \quad \frac{u^2}{\cos^2 y} - \frac{v^2}{\sin^2 y} = 1. \tag{10.21}$$

The semimajor and semiminor axes of the ellipses are $\cosh x$ and $\sinh x$; the semifocal distance is unity. The semiaxes of the hyperbolas are $\cos y$ and $\sin y$; the semifocal distance is unity. Hence, equations (10.21) represent families of confocal ellipses and hyperbolas. This transformation may be regarded as a transformation from Cartesian to elliptic coordinates.
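The two families in (10.21) can be verified point by point. The sketch below evaluates $w = \cosh z$ on a small grid of $(x, y)$ values (chosen by us so that $\sinh x$, $\cos y$, and $\sin y$ are all nonzero) and records the largest deviation from the ellipse and hyperbola equations.

```python
import cmath
import math

def ellipse_residual(x, y):
    """u^2/cosh^2 x + v^2/sinh^2 x - 1 for w = cosh(x + iy)."""
    w = cmath.cosh(complex(x, y))
    return (w.real / math.cosh(x)) ** 2 + (w.imag / math.sinh(x)) ** 2 - 1

def hyperbola_residual(x, y):
    """u^2/cos^2 y - v^2/sin^2 y - 1 for w = cosh(x + iy)."""
    w = cmath.cosh(complex(x, y))
    return (w.real / math.cos(y)) ** 2 - (w.imag / math.sin(y)) ** 2 - 1

# Illustrative grid (an assumption): x in 0.2..1.0, y in 0.3..1.2.
grid = [(0.2 * i, 0.3 * j) for i in range(1, 6) for j in range(1, 5)]
max_residual = max(
    max(abs(ellipse_residual(x, y)), abs(hyperbola_residual(x, y))) for x, y in grid
)
```

The residuals vanish to rounding accuracy, consistent with the image of each line $x = \text{const.}$ being an ellipse and that of each line $y = \text{const.}$ a hyperbola with common foci at $w = \pm 1$.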


Example 4. $w = 1/z$

Consider the function

$$w = \frac{1}{z} \tag{10.22}$$

and use rectangular coordinates to obtain

$$(u + iv)(x + iy) = 1.$$

By equating real and imaginary parts, we get

$$ux - vy = 1, \quad vx + uy = 0.$$

By an algebraic elimination first of $x$ and then of $y$, we arrive at the two families of circles:

$$u^2 + \left(v + \frac{1}{2y}\right)^2 = \frac{1}{4y^2}, \quad \left(u - \frac{1}{2x}\right)^2 + v^2 = \frac{1}{4x^2}. \tag{10.23}$$

The degenerate cases $x = 0$ and $y = 0$ cannot be handled by (10.23), but from (10.22) we find that, respectively, they give the two axes $u = 0$ and $v = 0$.

The transformation is shown in Fig. 10.6. Note that through the transformation, the edge of the $z$-plane at infinity ($z = \infty$) is pulled into the origin of the $w$-plane ($w = 0$), whereas the center of the $z$-plane is stretched out in all directions to infinity in the $w$-plane. It is possible to visualize this process by introducing an artificial concept, called “the point at infinity”; see Sect. 10.1.5 for details.

Remark. The mapping $w = 1/z$ reverses the orientation of the circumference of the circle to be mapped: $\arg(w) = -\arg(z)$. For example, the circumference of $|w| = 1$ is described in the negative sense if $|z| = 1$ is described in the positive sense.





Fig. 10.6. Mapping w = 1/z 





10.2.4 Mapping of Finite-Radius Circle 

Remember that the analyticity of functions is characterized by the isotropy of their derivatives. Owing to the isotropy, infinitely small circles on the $z$-plane are transformed into infinitely small circles on the $w$-plane through any analytic function $w = f(z)$. Of course, this shape-preserving behavior generally disappears when the circle has a finite radius, because the scale factor $h$ generally depends on $z$. Nevertheless, there exists a class of nontrivial analytic functions that transform a finite circle on the $z$-plane onto a circle on the $w$-plane; these are simply the bilinear transformations.

Theorem:

Bilinear transformations $w = f(z)$ map circles (or straight lines) on the $z$-plane onto circles (or straight lines) on the $w$-plane.


Proof Our proof is based on the fact that the bilinear transformation formula (10.11) can be rewritten as

$$w = f(z) = \frac{\alpha}{\gamma} + \frac{\beta\gamma - \alpha\delta}{\gamma}\,\frac{1}{\gamma z + \delta}.$$

This is composed of a sequence of the following transformations:

1. $w = z + b$, a simple translation of the plane by the complex vector $b$.

2. $w = az$, a rotation of the plane through the angle $\arg a$, followed by an expansion (or contraction) by $|a|$.

3. $w = 1/z$, an inversion that takes the interior of the unit circle to the exterior and vice versa.

Since each of these transformations maps circles (or straight lines) onto circles (or straight lines), so does their composition.


Remark. That the inversion $w = 1/z$ in Statement 3 above maps circles (or straight lines) onto circles (or straight lines) follows by considering the equation

$$\alpha(x^2 + y^2) + \beta x + \gamma y + \delta = 0,$$

which represents a circle ($\alpha \neq 0$) or a straight line ($\alpha = 0$) in the $z$-plane. This can be written as

$$\alpha|z|^2 + \frac{\beta}{2}(z + z^*) + \frac{\gamma}{2i}(z - z^*) + \delta = 0. \tag{10.24}$$

Then, the transformation $w = 1/z$ maps it onto

$$\delta|w|^2 + \frac{\beta}{2}(w + w^*) - \frac{\gamma}{2i}(w - w^*) + \alpha = 0,$$

which is a circle ($\delta \neq 0$) or a straight line ($\delta = 0$).
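The circle-preserving property can be illustrated numerically. The sketch below is only an illustration under assumed values: the coefficients `a, b, c, d` and the sample circle are hypothetical, chosen so that $ad - bc \neq 0$ and the circle avoids the pole $z = -d/c$. It maps twelve points of a circle, fits a circle through three of the images, and checks that the remaining images lie on it.

```python
import cmath
import math

def mobius(z, a, b, c, d):
    """Bilinear transformation w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def circumcenter(p, q, r):
    """Center of the circle through three non-collinear complex points."""
    # With p as origin, solve 2*(A . c) = |A|^2 and 2*(B . c) = |B|^2.
    ax, ay = (q - p).real, (q - p).imag
    bx, by = (r - p).real, (r - p).imag
    det = 2 * (ax * by - ay * bx)
    a2, b2 = abs(q - p) ** 2, abs(r - p) ** 2
    cx = (by * a2 - ay * b2) / det
    cy = (ax * b2 - bx * a2) / det
    return p + complex(cx, cy)

# Hypothetical coefficients and circle (assumptions for illustration only).
a, b, c, d = 2 + 1j, 1 - 1j, 1 + 0.5j, 3 + 0j
center, radius = 0.5 + 0.25j, 1.0

points = [center + radius * cmath.exp(2j * math.pi * k / 12) for k in range(12)]
images = [mobius(z, a, b, c, d) for z in points]

image_center = circumcenter(images[0], images[4], images[8])
radii = [abs(w - image_center) for w in images]
radius_spread = max(radii) - min(radii)  # ~0 if all images lie on one circle
```

A vanishing `radius_spread` indicates that all twelve image points share a single circle, as the theorem asserts.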
10.2.5 Invariance of the Cross Ratio

The following peculiarity of a Möbius transformation serves as a useful device in applications of conformal mapping.

Invariance of the cross ratio:

Any Möbius transformation $w = f(z)$ that maps the four points $z_i$ ($i = 1, 2, 3, 4$) into $w_i$ ($i = 1, 2, 3, 4$), respectively, satisfies

$$\frac{(w_1 - w_4)(w_3 - w_2)}{(w_1 - w_2)(w_3 - w_4)} = \frac{(z_1 - z_4)(z_3 - z_2)}{(z_1 - z_2)(z_3 - z_4)} = \lambda.$$

The constant $\lambda$ is called the cross ratio (or anharmonic ratio).


Proof Let $z_i$ ($i = 1, 2, 3, 4$) be four distinct finite points on the $z$-plane and let $w_i$ ($i = 1, 2, 3, 4$) be their corresponding images through a Möbius transformation. Then, for any two of the points, we have

$$w_k - w_l = \frac{\alpha z_k + \beta}{\gamma z_k + \delta} - \frac{\alpha z_l + \beta}{\gamma z_l + \delta} = \frac{(\alpha\delta - \beta\gamma)(z_k - z_l)}{(\gamma z_k + \delta)(\gamma z_l + \delta)}$$

and, consequently, for all four,

$$\frac{(w_1 - w_4)(w_3 - w_2)}{(w_1 - w_2)(w_3 - w_4)} = \frac{(z_1 - z_4)(z_3 - z_2)}{(z_1 - z_2)(z_3 - z_4)}. \tag{10.25}$$

This clearly ensures the invariance of the cross ratio $\lambda$ under the Möbius transformation.


Remark. If one of the points $w_i$, say $w_1$, is the point at infinity, the corresponding result is obtained by letting $w_1 \to \infty$ in (10.25). The left-hand side then takes the form

$$\frac{w_3 - w_2}{w_3 - w_4}.$$

This expression is to be regarded as the cross ratio of the points $\infty, w_2, w_3, w_4$. A similar remark applies if one of the points $z_i$ is the point at infinity.

If $z_4$ is taken to be a variable $z$, then the corresponding image $w_4$ on the $w$-plane becomes a function $w$ of $z$ that obeys the relation

$$\frac{(w_1 - w)(w_3 - w_2)}{(w_1 - w_2)(w_3 - w)} = \frac{(z_1 - z)(z_3 - z_2)}{(z_1 - z_2)(z_3 - z)}. \tag{10.26}$$

By solving (10.26) for $w$, we can verify that it transforms the three points $z_1, z_2, z_3$ into the corresponding points $w_1, w_2, w_3$. In this context, the expression (10.26) shows that a Möbius transformation is uniquely determined by three correspondences. Since a circle is uniquely determined by three points on its circumference, (10.26) can be used to find Möbius transformations that map a given circle determined by $z_i$ ($i = 1, 2, 3$) onto a second given circle (or straight line) determined by $w_i$ ($i = 1, 2, 3$).

Example If we take $z_1 = 1$, $z_2 = i$, $z_3 = -1$ and $w_1 = 0$, $w_2 = 1$, $w_3 = \infty$, we obtain the transformation

$$w = i\,\frac{1 - z}{1 + z}.$$

This maps the circle $|z| = 1$ onto the real axis and the interior $|z| < 1$ of the unit circle onto the upper half of the $w$-plane.
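Solving (10.26) for $w$ is mechanical and can be sketched in a few lines. The helper names below (`cross_ratio`, `mobius_from_three_points`) are our own, and the finite target points $0, 1, 2$ are a hypothetical choice, since the point at infinity would need the limiting form discussed in the remark above. The last lines check the example $w = i(1 - z)/(1 + z)$ directly.

```python
def cross_ratio(p1, p2, p3, p4):
    """Cross ratio in the convention of (10.25)."""
    return (p1 - p4) * (p3 - p2) / ((p1 - p2) * (p3 - p4))

def mobius_from_three_points(zs, ws):
    """Return w(z) obtained by solving (10.26) for w (all six points finite)."""
    z1, z2, z3 = zs
    w1, w2, w3 = ws
    A = w3 - w2
    B = w1 - w2

    def f(z):
        N = (z1 - z) * (z3 - z2)   # numerator of the right-hand side of (10.26)
        D = (z1 - z2) * (z3 - z)   # denominator of the right-hand side of (10.26)
        return (N * B * w3 - D * A * w1) / (N * B - D * A)

    return f

# Hypothetical finite targets for the three source points of the example.
zs = (1 + 0j, 1j, -1 + 0j)
ws = (0j, 1 + 0j, 2 + 0j)
f = mobius_from_three_points(zs, ws)
images = [f(z) for z in zs]

# The cross ratio with a fourth point is preserved by construction.
z4 = 0.3 + 0.2j
lam_z = cross_ratio(zs[0], zs[1], zs[2], z4)
lam_w = cross_ratio(ws[0], ws[1], ws[2], f(z4))

# Direct check of the example map w = i (1 - z) / (1 + z).
def book_map(z):
    return 1j * (1 - z) / (1 + z)

book_images = (book_map(1 + 0j), book_map(1j), book_map(0j))
```

Here `images` reproduces the three prescribed targets, and `book_images` shows that the example map sends $1 \to 0$, $i \to 1$, and the interior point $0$ to $i$ in the upper half-plane.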


Exercises 


1. Consider the function $w = f(z) = (z - z_0)/(z - z_0^*)$ with $\mathrm{Im}(z_0) \neq 0$. Show that it maps the region $\mathrm{Im}\,z > 0$ onto $|w| < 1$.

Solution: Set $z = x$ to obtain

$$|w| = \left|\frac{x - z_0}{x - z_0^*}\right| = 1.$$

That is, the image of the $x$-axis is the circumference of the unit circle centered at the origin of the $w$-plane.

Next we evaluate the image of a point off the $x$-axis in the upper half of the $z$-plane. Expressing $z$ and $z_0$ in polar form, $z = re^{i\theta}$ and $z_0 = r_0e^{i\theta_0}$, we have

$$|w|^2 = \frac{(re^{i\theta} - r_0e^{i\theta_0})(re^{-i\theta} - r_0e^{-i\theta_0})}{(re^{i\theta} - r_0e^{-i\theta_0})(re^{-i\theta} - r_0e^{i\theta_0})} = \frac{\xi_1 - \xi_2}{\xi_1 + \xi_2}, \tag{10.27}$$

where

$$\xi_1 = r^2 + r_0^2 - 2rr_0\cos\theta\cos\theta_0 \quad \text{and} \quad \xi_2 = 2rr_0\sin\theta\sin\theta_0.$$

Since $-1 < \cos\theta\cos\theta_0 < 1$, we have

$$(r - r_0)^2 < r^2 + r_0^2 - 2rr_0\cos\theta\cos\theta_0 = \xi_1, \quad \text{i.e.,} \quad \xi_1 > 0.$$

In addition, since $z$ and $z_0$ are in the upper half-plane, both $\sin\theta$ and $\sin\theta_0$ are positive, so $\xi_2 > 0$. Consequently, we have

$$|w|^2 < 1,$$

which means that the images of points in the upper half of the $z$-plane are located in the interior of the unit circle centered at the origin.

Remark. If $z_0$ were real, all points $z$ would be mapped onto the single point $w = 1$, which is the reason we assumed $\mathrm{Im}(z_0) \neq 0$ in the first place.


2. Show that $w = (z - z_0)/(z_0^*z - 1)$, in which $|z_0| < 1$, maps $|z| < 1$ onto $|w| < 1$ and $z = z_0$ onto $w = 0$.

Solution: Observe that

$$1 - |w|^2 = 1 - \frac{|z - z_0|^2}{|z_0^*z - 1|^2} = \frac{|z_0|^2|z|^2 - |z|^2 - |z_0|^2 + 1}{|z_0^*z - 1|^2} = \frac{(1 - |z|^2)(1 - |z_0|^2)}{|z_0^*z - 1|^2}.$$

Hence, $|z| = 1$ corresponds to $|w| = 1$. In addition, $z = z_0$ corresponds to $w = 0$. These mean that $|z| < 1$ is transformed onto $|w| < 1$.

3. Let $C$ and $C^*$ be two simple closed contours in the $z$- and the $w$-plane, respectively, and let $w = f(z)$ be analytic within and on $C$. Show that if $w = f(z)$ maps $C$ onto $C^*$ in such a way that $C^*$ is traversed by $w$ exactly once in the positive sense when $z$ describes $C$ in the positive sense, then $w = f(z)$ maps the domain bounded by $C$ onto the domain bounded by $C^*$.

Solution: We denote the domains bounded by $C$ and $C^*$ by $D$ and $D^*$, respectively. Then it suffices to prove that every point of $D^*$ is taken exactly once if $z$ is in $D$. Recall that the number $n$ of zeros of the function $f(z) - w_0$ in $D$ is given by

$$n = \frac{1}{2\pi i}\oint_C \frac{f'(z)}{f(z) - w_0}\,dz.$$

With the substitution $w = f(z)$, $f'(z)\,dz = dw$, this is rewritten as

$$n = \frac{1}{2\pi i}\oint_{C^*} \frac{dw}{w - w_0},$$

where the integration is extended over the contour $C^*$ into which $C$ is transformed by $w = f(z)$. By the residue theorem, the value of this expression is 1 if $w_0$ is within $C^*$ and 0 if $w_0$ is outside $C^*$. This shows that every point in $D^*$ is taken exactly once and that a value outside $D^*$ is not taken at all. This completes the proof.

4. Find a conformal mapping $w = f(z)$ of the region between the two circles $|z| = 1$ and $|z - (1/4)| = 1/4$ onto an annulus $a < |w| < 1$.

Solution: To solve this, we have to find a bilinear transformation that simultaneously maps $|z| < 1$ onto $|w| < 1$ and $|z - (1/4)| < 1/4$ onto a disc of the form $|w| < a$. Note that

$$w = \frac{z - \alpha}{1 - \alpha^*z}$$

maps $|z| < 1$ onto $|w| < 1$, and that

$$g(z) = a\,\frac{4z - 1 - \beta}{1 - \beta^*(4z - 1)}$$

maps $|z - (1/4)| < 1/4$ onto a disc of the form $|w| < a$. Equating coefficients leads us to $a = 2 - \sqrt{3}$.

5. Find the bilinear transformation that maps $z = 0, i, -i$ onto $w = 1, -1, 0$, respectively.

Solution: Set $[z, 0, i, -i] = [w, 1, -1, 0]$ to obtain $w = -(z + i)/(3z - i)$.

6. Show that four distinct arbitrary points on the $z$-plane can be mapped through the bilinear transformation onto $w = 1, -1, c, -c$ on the $w$-plane, where $c$ is a complex number depending on the cross ratio $\lambda$ of the mapping. Determine an explicit form of $c$ as a function of $\lambda$.

Solution: Let $[z_1, z_2, z_3, z_4] = [1, -1, c, -c]$ to obtain $c = (1 + \lambda \pm 2\sqrt{\lambda})/(1 - \lambda)$ and $c_1c_2 = 1$.

10.3 Applications to Boundary-Value Problems 

10.3.1 Schwarz-Christoffel Transformation

In the preceding section, we discussed the rich properties of the bilinear transformation, which can transform the upper half of the $z$-plane onto the unit circle of the $w$-plane. Now we turn to a similarly important class of mappings called the Schwarz-Christoffel transformation (abbreviated SC transformation), which transforms the upper (or lower) half of the $z$-plane onto the inside of an $n$-sided polygon drawn on the $w$-plane. This transformation is defined by the following integral:

$$w(z) = \beta + \alpha\int^{z}\prod_{i=1}^{n}(z' - x_i)^{-\theta_i/\pi}\,dz'. \tag{10.28}$$

Here $x_i$ ($1 \le i \le n$) are $n$ distinct fixed points along the $x$-axis, and the angle $\theta_i$ is defined as shown in Fig. 10.7, being either positive or negative according to whether we follow the boundary of the polygon counterclockwise or clockwise. (For example, $\theta_1$ and $\theta_2$ are positive, but $\theta_3$ is negative in Fig. 10.7.) The constant $\alpha$ gives rise to a magnification of the image by a factor $|\alpha|$ and a rotation of the image by an angle $\arg(\alpha)$. The constant $\beta$ generates a translation of the magnified and rotated image.

Fig. 10.7. Schwarz-Christoffel transformation of the real axis of the $z$-plane to a polygon on the $w$-plane


Remark. If we wish to transform the upper half of the $z$-plane into the exterior of the polygon in the $w$-plane, it suffices to define

$$w(z) = \beta + \alpha\int^{z}(z' - x_1)^{\theta_1/\pi}(z' - x_2)^{\theta_2/\pi}\cdots(z' - x_N)^{\theta_N/\pi}\,dz',$$

where the $\theta$'s are assigned the same values as in the preceding case.


Example The function

$$w = f(z) = \int_0^z \frac{d\xi}{\sqrt{(1 - \xi^2)(1 - k^2\xi^2)}} \tag{10.29}$$

maps the upper half of the $z$-plane ($\mathrm{Im}\,z > 0$) into the interior of a rectangle on the $w$-plane. In fact, (10.29) is obtained by putting $n = 4$, $\theta_i = \pi/2$ for all $i = 1, 2, 3, 4$ in the definition (10.28), followed by setting $x_1 = 1$, $x_2 = -1$, $x_3 = 1/k$, and $x_4 = -1/k$, all of which are located on the real axis. The integral in (10.29) is called an elliptic integral of the first kind.


10.3.2 Derivation of the Schwarz-Christoffel Transformation

In order to derive equation (10.28) for the Schwarz-Christoffel transformation, we let

$$x_1 < x_2 < \cdots < x_n$$

be points on the real axis and consider the function $f(z)$ whose derivative is

$$f'(z) = \alpha(z - x_1)^{-k_1}(z - x_2)^{-k_2}\cdots(z - x_n)^{-k_n}. \tag{10.30}$$

For this function we have

$$\arg f'(z) = \arg\alpha - k_1\arg(z - x_1) - k_2\arg(z - x_2) - \cdots - k_n\arg(z - x_n).$$

Now, visualize the point $z$ as moving from left to right along the real axis, starting to the left of the point $x_1$. When $z < x_1$, we have

$$\arg(z - x_1) = \arg(z - x_2) = \cdots = \arg(z - x_n) = \pi,$$

whereas for $x_1 < z < x_2$, $\arg(z - x_1) = 0$, the others remaining at $\pi$. Hence, as $z$ crosses $x_1$ from left to right, $\arg f'(z)$ increases by $k_1\pi$. It remains constant for $x_1 < z < x_2$ and increases by $k_2\pi$ as $z$ crosses $x_2$, etc. As a result, the image of the segment $-\infty < z < x_1$ becomes a straight line, the image of $x_1 < z < x_2$ being another whose argument exceeds that of the first by $k_1\pi$, and so on.
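The stepwise behavior of $\arg f'(z)$ can be observed numerically. The following sketch assumes hypothetical prevertices and exponents (three points with exponents summing to 2) and evaluates the argument as the sum of phases in the formula above, at one probe point in each interval, slightly above the real axis.

```python
import cmath
import math

def arg_f_prime(z, alpha, xs, ks):
    """arg f'(z) for f'(z) = alpha * prod (z - x_i)^(-k_i), as a sum of phases."""
    return cmath.phase(alpha) - sum(k * cmath.phase(z - x) for x, k in zip(xs, ks))

xs = [-1.0, 0.0, 2.0]      # hypothetical points x_1 < x_2 < x_3 on the real axis
ks = [0.5, 0.75, 0.75]     # hypothetical exponents k_i
alpha = 1.0 + 0.0j
eps = 1e-9                 # small offset so that z lies just above the real axis

# One probe point in each of the intervals separated by x_1, x_2, x_3.
probes = [-2.0, -0.5, 1.0, 3.0]
args = [arg_f_prime(complex(x, eps), alpha, xs, ks) for x in probes]

# Increase of arg f'(z) each time z crosses one of the x_i.
jumps = [args[i + 1] - args[i] for i in range(len(xs))]
total_turning = sum(jumps)
```

Each entry of `jumps` is close to $k_i\pi$, so the image of the real axis turns by that angle at the image of $x_i$.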

If we constrain the numbers $k_1, \cdots, k_n$ to lie between $-1$ and $1$, then the increments in the argument of $f'(z)$ will lie between $-\pi$ and $\pi$. Further, for $k_1 < 1, k_2 < 1, \cdots, k_n < 1$, the function $f(z)$ whose derivative is (10.30) is continuous at each of the points $x_1, x_2, \cdots, x_n$, because the singularities of $f'(z)$ at these points are integrable. Therefore, the image of the moving point $z$ will be a polygonal line. Finally, integrate (10.30) to get the equation

$$f(z) = \beta + \alpha\int^{z}(z' - x_1)^{-k_1}(z' - x_2)^{-k_2}\cdots(z' - x_n)^{-k_n}\,dz', \tag{10.31}$$

which maps the $x$-axis onto a polygonal line.

Remark.

1. The sum of the exterior angles of this polygonal line is

$$k_1\pi + k_2\pi + \cdots + k_n\pi = \pi\sum_{i=1}^{n}k_i.$$

Hence, in order for the polygon to be closed, it is necessary that $\sum_{i=1}^{n}k_i = 2$. In particular, when $k_i > 0$ for all $i$, the polygon is convex.

2. The complex constants $\alpha$ and $\beta$ control the position, size, and orientation of the polygon. Thus $\beta$ may be so chosen that one of the vertices of the polygon coincides with some specified point, e.g., the origin. Then $\alpha$ may be chosen so that one side of the polygon is of given size and parallel to a given direction.


10.3.3 The Method of Inversion 

The Schwarz-Christoffel transformation itself is applicable to polygons composed of straight lines, but not to those composed of circular arcs. Nevertheless, when combined with the method of inversion, the transformation can be extended to regions bounded by circular arcs.

Inversion with respect to a circle:

An inversion transformation $w = f(z)$ with respect to a circle $|z| = a$ is defined by

$$w = \frac{a^2}{z^*}, \tag{10.32}$$

through which the interior points of the circle are mapped onto exterior points, and vice versa.


The inversion preserves the magnitude of the angle between two intersecting curves, but it reverses the sign of the angle. This is attributed to the fact that (10.32) consists of two successive transformations: the first is $a^2/z$, and the second is a reflection with respect to the real axis. The first of these is conformal, whereas the second maintains the magnitude of the angle but reverses its sign.

For the purpose of this section, we investigate the inversion of a circle of radius $|z_0|$ centered at $z = z_0 \neq 0$. This circle is expressed by

$$|z - z_0| = |z_0| \tag{10.33}$$

or, taking $z_0$ to be real for simplicity,

$$z^*z - z_0(z + z^*) = 0. \tag{10.34}$$

Note that this circle passes through the origin, i.e., the center of the inversion circle. Through the inversion (10.32), the circle (10.34) is mapped onto

$$\frac{a^4}{ww^*} - z_0\left(\frac{a^2}{w} + \frac{a^2}{w^*}\right) = 0.$$

Fig. 10.8. Inversion of the circle $|z - z_0| = |z_0|$ in (10.33) with respect to a circle $|z| = a$ through the mapping $w = a^2/z^*$ given in (10.32)

By multiplying both sides by $ww^*$ and putting $w = u + iv$, we have

$$a^4 - 2a^2z_0u = 0,$$

or equivalently,

$$u = \frac{a^2}{2z_0}.$$

This means that by the inversion, the circle (10.33) is mapped onto a straight line parallel to the imaginary axis of the $w$-plane (see Fig. 10.8).
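The result $u = a^2/(2z_0)$ is easy to confirm numerically. The sketch below assumes the illustrative values $a = 2$ and a real $z_0 = 1.5$; it samples the circle (10.33), skips the sample at the origin (whose image is the point at infinity), and applies the inversion (10.32).

```python
import cmath
import math

def invert(z, a):
    """Inversion (10.32) with respect to the circle |z| = a: w = a^2 / conj(z)."""
    return a * a / z.conjugate()

a, z0 = 2.0, 1.5  # illustrative values (assumptions); z0 is taken real

# Sample the circle |z - z0| = |z0|, skipping the angle pi (the origin).
angles = [2 * math.pi * k / 16 for k in range(16) if k != 8]
images = [invert(z0 + z0 * cmath.exp(1j * t), a) for t in angles]

expected_u = a * a / (2 * z0)
max_deviation = max(abs(w.real - expected_u) for w in images)
```

All image points share the same real part $a^2/(2z_0)$ while their imaginary parts vary, that is, they lie on a vertical straight line in the $w$-plane.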

The role that inversion plays in extending the Schwarz-Christoffel transformation should now be clear. Consider two intersecting circular arcs such as $P$ and $Q$ in Fig. 10.9 and a circle $R$ of radius $a$ whose center is the intersection of the two circular arcs. Then, by an inversion with respect to $R$, the point at the intersection is transformed into the point at infinity, the arcs themselves being transformed into the solid portions of the lines $P'$ and $Q'$. As a result, the Schwarz-Christoffel transformation may now be applied to these two straight lines, whereas it may not be applied to the original circular arcs.

Exercises 

1. Find a transformation that maps the upper half of the $z$-plane onto the triangular region shown in Fig. 10.10 in such a way that the points $x_1 = -1$ and $x_2 = 1$ are mapped onto the points $w = -a$ and $w = a$, respectively, and the point $x_3 = \pm\infty$ is mapped onto $w = ib$.

Solution: Let us denote the angles at $w_1$ and $w_2$ in the $w$-plane by $\phi_1 = \phi_2 = \phi$, where $\phi = \tan^{-1}(b/a)$. Since $x_3$ is taken at infinity, we may omit the corresponding factor in (10.28) to obtain

$$w = \beta + \alpha\int_0^z(\xi + 1)^{-\phi/\pi}(\xi - 1)^{-\phi/\pi}\,d\xi = \beta + \alpha\int_0^z(\xi^2 - 1)^{-\phi/\pi}\,d\xi. \tag{10.35}$$

Fig. 10.9. Inversion of circular arcs $P$ and $Q$ with respect to the circle $R$

Fig. 10.10. Mapping of the upper half of the $z$-plane onto a certain limited region of the $w$-plane

The required transformation may then be found by fixing the constants $\alpha$ and $\beta$ as follows. Since the point $z = 0$ lies on the line segment $x_1x_2$, it will be mapped onto the line segment $w_1w_2$ in the $w$-plane, and by symmetry must be mapped onto the point $w = 0$. Thus, setting $z = 0$ and $w = 0$ in (10.35), we obtain $\beta = 0$.

An expression for $\alpha$ can be found by considering the region in the $w$-plane in Fig. 10.10 to be the limiting case of the triangular region with the vertex $w_3$ at infinity. Thus we may use the above, but with the angles at $w_1$ and $w_2$ set to $\phi = \pi/2$. From (10.35), we obtain $w = \alpha\int_0^z(1/\sqrt{\xi^2 - 1})\,d\xi = i\alpha\sin^{-1}z$. By setting $z = 1$ and $w = a$, we find $i\alpha = 2a/\pi$, so the required transformation is $w = (2a/\pi)\sin^{-1}z$.

2. Find the conformal mapping that transforms the interior of the circle $|z| < 1$ to the interior of a polygon on the $w$-plane, subject to the condition that the points $z_1, z_2, \cdots, z_n$ lying on the circle $|z| = 1$ are mapped, respectively, onto the vertices $w_1, w_2, \cdots, w_n$ of the polygon.

Solution: Consider first the transformation

$$\tau = i\,\frac{i + z}{i - z}, \tag{10.36}$$

which maps $|z| < 1$ onto $\mathrm{Im}\,\tau > 0$. It yields

$$\frac{d\tau}{dz} = \frac{-2}{(z - i)^2} \quad \text{and} \quad \tau - \tau_j = \frac{2(z_j - z)}{(z - i)(z_j - i)} \quad (j = 1, 2, \cdots, n). \tag{10.37}$$

Next, we assume that through (10.36), the points $z_1, z_2, \cdots, z_n$ are mapped, respectively, onto the points $\tau_1, \tau_2, \cdots, \tau_n$ that are located on the line $\mathrm{Im}\,\tau = 0$. Then, the transformation that maps $\mathrm{Im}\,\tau > 0$ onto the interior of a polygon on the $w$-plane is given by $w = w(\tau)$, whose derivative reads

$$\frac{dw}{d\tau} = \alpha(\tau - \tau_1)^{(k_1/\pi)-1}(\tau - \tau_2)^{(k_2/\pi)-1}\cdots(\tau - \tau_n)^{(k_n/\pi)-1}. \tag{10.38}$$

Here $k_i$ is the internal angle of the polygon at the $i$th vertex, which satisfies $\sum_{i=1}^{n}k_i = (n - 2)\pi$. From (10.37) and (10.38), we have

$$\frac{dw}{dz} = \frac{-2\alpha}{(z - i)^2}\,\frac{(z_1 - z)^{(k_1/\pi)-1}\cdots(z_n - z)^{(k_n/\pi)-1}}{2^2(z - i)^{-2}(z_1 - i)^{(k_1/\pi)-1}\cdots(z_n - i)^{(k_n/\pi)-1}}.$$

Replace $-(\alpha/2)(z_1 - i)^{1-(k_1/\pi)}\cdots(z_n - i)^{1-(k_n/\pi)}$ by $\alpha$ to obtain the final result:

$$w = f(z) = \alpha\int_{z_0}^{z}(z_1 - \zeta)^{(k_1/\pi)-1}\cdots(z_n - \zeta)^{(k_n/\pi)-1}\,d\zeta + \beta,$$

where $\alpha\,(\neq 0)$ and $\beta$ are complex constants and $z_0 \neq z_1, \cdots, z_n$.
3. Prove that the function

$$w = f(z) = \int_0^z \frac{d\xi}{(1 - \xi^6)^{1/3}} \tag{10.39}$$

maps the unit circle on the $z$-plane onto a regular hexagon on the $w$-plane.

Solution: Observe that $\xi^6 - 1 = (\xi - \xi_1)\cdots(\xi - \xi_6)$ with $|\xi_j| = 1$ ($j = 1, \cdots, 6$). Similarly to Exercise 2 above, we map $|z| < 1$ onto $\mathrm{Im}\,\tau > 0$, and then let the points $\tau_1, \cdots, \tau_6$ located on the line $\mathrm{Im}\,\tau = 0$ correspond to $\xi_1, \cdots, \xi_6$. Then, by setting $n = 6$ and $k_j = (2/3)\pi$ for all $j$, we see that the transformation (10.39) maps $|z| < 1$ onto a regular hexagon on the $w$-plane, $(\sqrt[3]{2}/6)\,\Gamma(1/3)^2/\Gamma(2/3)$ on a side.

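4. Suppose that $\phi(z)$ satisfies the Laplace equation and let $w = f(z)$ be a conformal mapping. Then, show that the function

$$\phi(w) = \phi(u, v)$$

also satisfies Laplace’s equation in the $w$-plane; i.e.,

$$\frac{\partial^2\phi}{\partial u^2} + \frac{\partial^2\phi}{\partial v^2} = 0. \tag{10.40}$$

Solution: Since $u = u(x, y)$ and $v = v(x, y)$, the partial derivative $\partial/\partial x$ can be rewritten as $\partial/\partial x = u_x(\partial/\partial u) + v_x(\partial/\partial v)$, where $u_x = \partial u/\partial x$ and $v_x = \partial v/\partial x$. It yields

$$\frac{\partial^2\phi}{\partial x^2} = \left(u_x\frac{\partial}{\partial u} + v_x\frac{\partial}{\partial v}\right)\left(u_x\frac{\partial}{\partial u} + v_x\frac{\partial}{\partial v}\right)\phi = (u_x)^2\frac{\partial^2\phi}{\partial u^2} + (v_x)^2\frac{\partial^2\phi}{\partial v^2} + 2u_xv_x\frac{\partial^2\phi}{\partial u\,\partial v}. \tag{10.41}$$

Similarly, we have

$$\frac{\partial^2\phi}{\partial y^2} = (u_y)^2\frac{\partial^2\phi}{\partial u^2} + (v_y)^2\frac{\partial^2\phi}{\partial v^2} + 2u_yv_y\frac{\partial^2\phi}{\partial u\,\partial v} = (v_x)^2\frac{\partial^2\phi}{\partial u^2} + (u_x)^2\frac{\partial^2\phi}{\partial v^2} - 2v_xu_x\frac{\partial^2\phi}{\partial u\,\partial v}, \tag{10.42}$$

where we have used the Cauchy-Riemann relations: $u_x = v_y$, $u_y = -v_x$. (The terms involving second derivatives of $u$ and $v$, namely $u_{xx}\,\partial\phi/\partial u + v_{xx}\,\partial\phi/\partial v$ in (10.41) and $u_{yy}\,\partial\phi/\partial u + v_{yy}\,\partial\phi/\partial v$ in (10.42), have been omitted because they cancel on addition, $u$ and $v$ being harmonic.) Adding the last expressions of (10.41) and (10.42), we obtain

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = \left[(u_x)^2 + (u_y)^2\right]\left(\frac{\partial^2\phi}{\partial u^2} + \frac{\partial^2\phi}{\partial v^2}\right).$$

The quantity inside the square brackets is equal to $|f'(z)|^2$, which is nonzero for a conformal mapping $f(z)$. As a consequence, we conclude that

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0 \iff \frac{\partial^2\phi}{\partial u^2} + \frac{\partial^2\phi}{\partial v^2} = 0.$$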
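The invariance of Laplace's equation can be illustrated with a finite-difference check. This is only a sketch under assumed choices: the harmonic function $\mathrm{Re}(w^3) = u^3 - 3uv^2$ in the $w$-plane and the conformal map $w = e^z$ are picked by us for illustration, and the five-point stencil gives an approximation, not an exact Laplacian.

```python
import cmath

def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference approximation of the 2D Laplacian."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4 * func(x, y)) / (h * h)

def phi_w(u, v):
    """A harmonic function in the w-plane (illustrative choice): Re(w^3)."""
    return u ** 3 - 3 * u * v * v

def phi_z(x, y):
    """The same potential expressed in the z-plane through w = exp(z)."""
    w = cmath.exp(complex(x, y))
    return phi_w(w.real, w.imag)

lap_w = laplacian(phi_w, 0.7, -0.4)  # Laplacian with respect to (u, v)
lap_z = laplacian(phi_z, 0.3, 0.5)   # Laplacian with respect to (x, y)
```

Both approximate Laplacians are zero to within the discretization error, in line with the equivalence derived above.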


10.4 Applications in Physics and Engineering 

10.4.1 Electric Potential Field in a Complicated Geometry 

The Schwarz-Christoffel transformation is useful in mathematical physics, since it can be used to solve two-dimensional Laplace equations under certain boundary conditions. In fact, there are many physical systems that are described by Laplace’s equation subject to Dirichlet or Neumann boundary conditions. For example, Laplace’s equation can be used to describe heat conduction in a uniform medium, nonturbulent fluid flow, and an electrostatic field in a uniform system. In this subsection, we demonstrate how the Schwarz-Christoffel transformation works efficiently to solve such two-dimensional Laplace equations. It should be emphasized that our method is independent of the physical system being described. Here, we apply the transformation to problems in electrostatics in order to illustrate the method of solution, bearing in mind that these techniques are also applicable to problems involving other physical systems.

The general procedure for determining the electrostatic potential by using conformal mapping methods involves transforming a complicated charge-distribution geometry in the $z$-plane into a simple geometry in the $w$-plane. After solving the problem for the simpler geometry, the inverse transformation to the $z$-plane is applied to obtain the potential for the original geometry.

As a concrete example, we consider a metal block with a cut-out wedge of angle $\gamma$ as shown in Fig. 10.11. There is a vacuum inside the wedge. The block extends to $\pm\infty$ in the direction perpendicular to the plane of the page. Since charge moves freely inside a metal, all of the charge placed in the conductor is distributed in such a way that the potential at all points along the edges of the wedge is the same. We denote this potential by $\phi_0$; i.e., the system under consideration is subject to the Dirichlet boundary conditions given by

$$\phi(r, \theta = 0) = \phi(r, \theta = \gamma) = \phi_0. \tag{10.43}$$



Fig. 10.11. Wedge cut of a metal 

Our objective is to evaluate the potential $\phi(z)$ at points in the vacuum region inside the wedge (defined by $0 < \arg(z) < \gamma$). This potential satisfies the Laplace equation, and thus, it can be determined by conformal mapping methods. For this purpose, we attempt to find the mapping that transforms the wedge in the $z$-plane onto the real axis of the $w$-plane. We know that the transformation of the real axis in the $z$-plane onto the wedge shown in Fig. 10.11 is given by the Schwarz-Christoffel transformation:

$$w = \beta + \alpha(z - x_1)^{-\theta_1/\pi}.$$

Therefore, the inverse mapping

$$z = x_1 + \left(\frac{w - \beta}{\alpha}\right)^{-\pi/\theta_1} \tag{10.44}$$

transforms the wedge in the $w$-plane with an internal angle $-\theta_1$ onto the real axis of the $z$-plane. By interchanging $z$ and $w$ in (10.44), we obtain the mapping

$$w = u_1 + \left(\frac{z - \beta}{\alpha}\right)^{-\pi/\theta_1}, \tag{10.45}$$

which transforms the wedge in the $z$-plane onto the real axis of the $w$-plane.

In order to apply this mapping to the configuration shown in Fig. 10.11, we set

$$\gamma = -\theta_1, \quad u_1 = \beta = 0, \quad \text{and} \quad \frac{1}{\alpha} = 1,$$

where $\alpha$ is real. Then, the mapping in (10.45) becomes

$$w = z^{\pi/\gamma}. \tag{10.46}$$

Remark. It immediately follows that the mapping in (10.46) transforms the space within the wedge onto the upper half of the $w$-plane. This is because points within the wedge that satisfy the condition $0 < \arg(z) = \theta < \gamma$ are mapped onto $w = r^{\pi/\gamma}e^{i\pi\theta/\gamma}$, whose argument $\pi\theta/\gamma$ takes values in the interval $(0, \pi)$.

Remember that the Dirichlet boundary condition is invariant under conformal mappings. Hence, the boundary condition of (10.43) is mapped to

$$\phi(u, v = 0) = \phi_0, \tag{10.47}$$

where $v = 0$ is the image of the edges of the wedge. As noted earlier, the mapping in (10.46) transforms the problem of finding the potential in the region within the wedge in Fig. 10.11 to that of finding the potential in the upper half of the $w$-plane due to a flat metal surface that extends along the entire $u$-axis and is maintained at a potential $\phi_0$ by a uniformly distributed charge.

We now consider the “mapped” Laplace equation for the $w$-plane. Since all points on the surface of the flat plate are at the same potential, the potential at all points $(u, v)$ located at the same distance $v$ above the plate is the same. Thus, the potential at any point must be independent of the value of $u$, and the Laplace equation in the $w$-plane becomes

$$\frac{d^2\phi}{dv^2} = 0.$$

Integration of this differential equation followed by application of the boundary condition (10.47) yields

$$\phi(v) = \phi_0 + cv. \tag{10.48}$$

The constant $c$ is obtained by using the property that the derivative of the potential (i.e., the electrostatic field) is a constant for a charged flat plate. Similar to $\phi_0$, the value of this constant field $E_0$ depends on how much charge is distributed over a given area on the plate. With reference to (10.48),

$$\frac{\partial\phi}{\partial v} = c = -E_0, \quad \text{so that} \quad \phi(v) = \phi_0 - E_0v.$$

In order to complete the analysis, the potential must be expressed in terms of the coordinates in the $z$-plane. From the expression

$$v = \mathrm{Im}(w) = \mathrm{Im}\left(r^{\pi/\gamma}e^{i\pi\theta/\gamma}\right) = r^{\pi/\gamma}\sin(\pi\theta/\gamma),$$

the potential is given by

$$\phi = \phi_0 - E_0r^{\pi/\gamma}\sin(\pi\theta/\gamma) = \phi_0 - E_0(x^2 + y^2)^{\pi/(2\gamma)}\sin\left[\frac{\pi}{\gamma}\tan^{-1}\left(\frac{y}{x}\right)\right]. \tag{10.49}$$

This is the final solution to the problem in question. We see from (10.49) that $\phi(r, \theta)$ is constant when

$$r^{\pi/\gamma}\sin(\pi\theta/\gamma) = \text{const.}$$

This is the equation for a family of equipotential curves.
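The solution (10.49) can be checked against both requirements of the problem. The sketch below assumes illustrative values ($\gamma = 1$ rad, $\phi_0 = 5$, $E_0 = 2$) and uses `math.atan2` in place of $\tan^{-1}(y/x)$ so that the angle is well defined in every quadrant; the helper names are our own.

```python
import math

def wedge_potential(x, y, gamma, phi0, E0):
    """Potential (10.49) inside a wedge of opening angle gamma."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return phi0 - E0 * r ** (math.pi / gamma) * math.sin(math.pi * theta / gamma)

gamma, phi0, E0 = 1.0, 5.0, 2.0  # illustrative parameters (assumptions)

# Values on the two edges theta = 0 and theta = gamma at radius 0.8.
edge_lower = wedge_potential(0.8, 0.0, gamma, phi0, E0)
edge_upper = wedge_potential(0.8 * math.cos(gamma), 0.8 * math.sin(gamma), gamma, phi0, E0)

def laplacian(func, x, y, h=1e-3):
    """Five-point finite-difference approximation of the 2D Laplacian."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4 * func(x, y)) / (h * h)

# Approximate Laplacian at an interior point of the wedge.
interior_laplacian = laplacian(
    lambda x, y: wedge_potential(x, y, gamma, phi0, E0), 1.0, 0.4
)
```

Under these assumptions both edges sit at $\phi_0$ and the approximate Laplacian vanishes inside the wedge, as required by (10.43) and the Laplace equation.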

10.4.2 Joukowsky Airfoil 

Our final discussion related to the applications of conformal mappings concerns the Joukowsky transformation, which is an important conformal mapping that has been historically employed in the theory of airfoil design. Here, the term “airfoil” refers to the cross-sectional shape of a wing (or a propeller or a turbine). According to the literature on airfoil theory, any object with an angle of attack in a moving fluid generates a lift, a force perpendicular to the flow. Airfoils are designed as efficient shapes that increase the lift that the object generates. The Joukowsky transformation maps a circle on the complex plane into a family of airfoil shapes called Joukowsky airfoils, which simplify the analysis of two-dimensional fluid flows around an airfoil with a complicated geometry.

Fig. 10.12. The Joukowsky transformation (10.50) of the circle $|z - z_0| = |1 - z_0|$ with $z_0 = (-0.2, 0.2)$ to the airfoil indicated by the thick curve

The Joukowsky transformation $w = f(z)$ is defined by

$$w = f(z) = z + \frac{1}{z}, \tag{10.50}$$

where $z$ is located on a circle $C$ that passes through the point $z = 1$ and encloses the point $z = -1$ as well as the origin $z = 0$. Note that the center of the circle, denoted by $z_0$, does not coincide with the origin, but is located close to the origin. In fact, the coordinates of $z_0$ are variables, and changes in these variables alter the geometry of the resulting airfoil. An example of an airfoil generated by the transformation (10.50) is shown in Fig. 10.12, where $z_0 = (-0.2, 0.2)$. We see that the circle $C: |z - z_0| = |1 - z_0|$ is mapped onto an airfoil indicated by a thick curve. The streamlines for a flow around the airfoil can be obtained by applying the transformation (10.50) to the streamlines for a flow around the circle, and the latter can be easily evaluated.
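The airfoil of Fig. 10.12 can be generated in a few lines. The following is a minimal sketch using the center $z_0 = (-0.2, 0.2)$ quoted above; the number of sample points (360) and the variable names are our own choices, and plotting is left out to keep the example self-contained.

```python
import cmath
import math

def joukowsky(z):
    """Joukowsky transformation (10.50): w = z + 1/z."""
    return z + 1 / z

z0 = complex(-0.2, 0.2)   # circle center used in Fig. 10.12
radius = abs(1 - z0)      # the circle C passes through z = 1

# Sample the circle C and map it to the airfoil contour.
circle = [z0 + radius * cmath.exp(2j * math.pi * k / 360) for k in range(360)]
airfoil = [joukowsky(z) for z in circle]

# The point z = 1 on C is mapped to the sharp trailing edge at w = 2.
trailing_edge = joukowsky(1 + 0j)

# Conditions on C stated in the text: it encloses z = -1 and the origin.
encloses_minus_one = abs(-1 - z0) < radius
encloses_origin = abs(z0) < radius
```

The list `airfoil` holds the contour points $(\mathrm{Re}\,w, \mathrm{Im}\,w)$; moving `z0` changes the thickness and camber of the resulting shape.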
Fourier Series


Abstract A Fourier series is an expansion of a periodic function in terms of an 
infinite sum of sines and cosines. The use of a Fourier series allows us to break up an 
arbitrary periodic function into a set of simple terms that can be solved individually 
and then recombined in order to obtain the solution to the original problem with 
the desired level of accuracy. In this chapter, we place particular emphasis on the 
mean convergence property of a Fourier series (Sect. 11.2.1) and the conditions 
that are necessary for the series to be uniformly convergent (Sect. 11.3.1). Better 
understanding of convergence properties clarifies the reasons for the utility and the 
limit of validity of Fourier series expansion in mathematical physics. 


11.1 Basic Properties 

11.1.1 Definition 


Fourier series are infinite series consisting of trigonometric functions with a
particular definition of expansion coefficients. They can be applied to almost
all periodic functions, whether the functions are continuous or not. With this
expansion, physical phenomena involving some periodicity are reduced to a
superposition of simple trigonometric functions, which helps us a great deal
in arithmetic and practical aspects. We begin this section with a description
of the basic properties of Fourier series. We follow this by considering the
convergence theory of Fourier series, which is the issue in the next section.

First of all, it is important to clarify the distinction between the following 
two concepts: trigonometric series and Fourier series. 


Trigonometric series:

The series

\[ \frac{A_0}{2} + \sum_{n=1}^{\infty} \left( A_n \cos nx + B_n \sin nx \right) \]

is called a trigonometric series.

Here the sets of coefficients $\{A_n\}$ and $\{B_n\}$ can be taken arbitrarily. (The
expression $A_0/2$ instead of $A_0$ is just due to our convention.) Among the
infinite choices of $\{A_n\}$ and $\{B_n\}$, a specific definition of the coefficients noted
below provides the Fourier series of a given function f(x).


Fourier series:

The series

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right) \tag{11.1} \]

is called a Fourier series of a function f(x) if and only if the coefficients
are given by the Euler-Fourier formula expressed by

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx \, dx, \qquad
   b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx \, dx. \tag{11.2} \]


Accordingly, a Fourier series is a specific kind of trigonometric series whose
coefficients bear a definite relation (11.2) to some function f(x). In (11.1) we
have written the constant term as $a_0/2$ rather than $a_0$, so that the expression
for $a_0$ is given by taking n = 0 in (11.2). There is no $b_0$ because $\sin(0 \cdot x) = 0$.
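The Euler-Fourier formula (11.2) can be evaluated numerically for a concrete function. The sketch below approximates the integrals by the midpoint rule; the function name `fourier_coefficients` and the sample count are illustrative choices, not part of the text.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Approximate a_n and b_n of (11.2) by the midpoint rule on [-pi, pi]."""
    h = 2 * math.pi / samples
    a = 0.0
    b = 0.0
    for m in range(samples):
        x = -math.pi + (m + 0.5) * h      # midpoint of the m-th subinterval
        a += f(x) * math.cos(n * x)
        b += f(x) * math.sin(n * x)
    return a * h / math.pi, b * h / math.pi

# Example: f(x) = x on [-pi, pi]; the cosine coefficients vanish because f is odd.
a1, b1 = fourier_coefficients(lambda x: x, 1)
```

For f(x) = x the sine coefficients should come out close to $2(-1)^{n+1}/n$, and for the constant function f(x) = 1 the call with n = 0 should return $a_0 \approx 2$, so that the constant term $a_0/2$ equals 1.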

By definition, every Fourier series is a trigonometric series. However, the 
converse is not true, as demonstrated below. 

Example It is known that the trigonometric series given by

\[ \sum_{n=2}^{\infty} \frac{\sin nx}{\log n} \]

is not a Fourier series. Indeed, no function can be related to the coefficient
$1/\log n$ via (11.2).


11.1.2 Dirichlet Theorem 

Emphasis should be placed on the fact that the definition of Fourier series
provides no information as to its convergence; thus the infinite series (11.1)
may converge or diverge depending on the behavior of the function f(x).
This leads us to discuss which functions f(x) make the series (11.1) convergent.
This issue is clarified in part by the following theorem (and by referring
to Fig. 11.1):

Fig. 11.1. (a) Continuous and smooth function, (b) Continuous but nonsmooth
function, (c) Function with a finite number of discontinuities


Dirichlet theorem:

If f(x) is periodic with the period $2\pi$ and if f(x) is continuous or has at
most a finite number of discontinuities in $[0, 2\pi]$, then its Fourier series
converges to

1. f(x), if x is a point of continuity, or

2. $\dfrac{f(x + 0) + f(x - 0)}{2}$, if x is a point of discontinuity.


The set of conditions noted above is called Dirichlet's conditions. It is worth
noting that the Dirichlet conditions are sufficient but not necessary. That
is, if the conditions are satisfied, the convergence of the series is guaranteed;
but if they are not satisfied, the series may or may not converge. An exact
proof of Dirichlet's theorem requires rather complicated calculations, which
will be demonstrated in the next section.

Remarks. 

1. The Dirichlet conditions do not require the continuity of f(x) within
$[0, 2\pi]$.

2. Almost all periodic functions that we encounter in physical problems satisfy
the Dirichlet conditions; therefore, the Fourier series expansion can
be used with little concern about its convergence.

It follows that if f(x) is continuous within $[0, 2\pi]$ and satisfies Dirichlet's
conditions, then the Fourier series of f(x) converges to f(x) at all the points
within $[0, 2\pi]$. This means that the Fourier series of f(x) converges uniformly
to f(x). Once uniform convergence is ensured, we generally write

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right) \tag{11.3} \]

with the definition (11.2) for the coefficients. Consequently, if we form the
Fourier series of f(x) without first examining its convergence to f(x), we
should write

\[ f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nx + b_n \sin nx \right) \tag{11.4} \]

instead of (11.3). The symbol $\sim$ in (11.4) means that the series on the right-hand
side only corresponds to the function f(x) and can be replaced by the
equality "=" only if we succeed in proving that the infinite series converges
uniformly to f(x).


11.1.3 Fourier Series of Periodic Functions 


The preceding arguments were limited to the case of periodic functions with
period $2\pi$. But Fourier series expansions can apply to periodic functions whose
periods differ from $2\pi$. This is seen by replacing x in (11.3) by $(2\pi/\lambda)x$, which
transforms a series convergent in the interval $[0, 2\pi]$ to another series
convergent in $[0, \lambda]$. The resulting Fourier series is

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos nkx + b_n \sin nkx \right), \tag{11.5} \]

where $k = 2\pi/\lambda$ and

\[ a_n = \frac{2}{\lambda} \int_0^{\lambda} f(x) \cos nkx \, dx
   \quad \text{and} \quad
   b_n = \frac{2}{\lambda} \int_0^{\lambda} f(x) \sin nkx \, dx. \tag{11.6} \]

Obviously, these latter expressions can be reduced to the original definitions
(11.1) and (11.2) by setting $\lambda = 2\pi$.

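The expressions (11.5) and (11.6) become more concise by using the
relations

\[ \cos(nkx) = \frac{e^{inkx} + e^{-inkx}}{2}, \qquad
   \sin(nkx) = \frac{e^{inkx} - e^{-inkx}}{2i}. \]

Then the Fourier series reads

\[ f(x) = \frac{a_0}{2}
        + \sum_{n=1}^{\infty} \frac{a_n - i b_n}{2} e^{inkx}
        + \sum_{n=1}^{\infty} \frac{a_n + i b_n}{2} e^{-inkx}. \tag{11.7} \]

We rewrite the index n in the second sum as $-n'$ to find

\[ \sum_{n=1}^{\infty} \frac{a_n + i b_n}{2} e^{-inkx}
   = \sum_{n'=-\infty}^{-1} \frac{a_{-n'} + i b_{-n'}}{2} e^{in'kx}
   = \sum_{n=-\infty}^{-1} \frac{a_n - i b_n}{2} e^{inkx}, \]

where the identities $a_{-n} = a_n$ and $b_{-n} = -b_n$ were used. As a result, we
obtain a complex form of the Fourier series as

\[ f(x) = \frac{a_0}{2}
        + \sum_{n=1}^{\infty} \frac{a_n - i b_n}{2} e^{inkx}
        + \sum_{n=-\infty}^{-1} \frac{a_n - i b_n}{2} e^{inkx}
        = \sum_{n=-\infty}^{\infty} c_n e^{inkx} \tag{11.8} \]

with the definition

\[ c_n = \frac{a_n - i b_n}{2}. \tag{11.9} \]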


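An explicit form of $c_n$ is given by substituting the definition of $a_n$ and $b_n$,
given by (11.6), into (11.9) as

\[ c_n = \frac{1}{2} \left[ \frac{2}{\lambda} \int_0^{\lambda} f(x) \cos(nkx) \, dx
        - i \, \frac{2}{\lambda} \int_0^{\lambda} f(x) \sin(nkx) \, dx \right]
      = \frac{1}{\lambda} \int_0^{\lambda} f(x) e^{-inkx} \, dx. \tag{11.10} \]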
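The relation (11.9) between the complex and real coefficients can be checked numerically. The sketch below evaluates (11.6) and (11.10) with the same midpoint rule; the helper names, the test function x(λ - x), and the period λ = 3 are illustrative assumptions.

```python
import cmath
import math

def complex_coefficient(f, n, period, samples=5000):
    """c_n = (1/period) * integral over [0, period] of f(x) exp(-i n k x), cf. (11.10)."""
    k = 2 * math.pi / period
    h = period / samples
    total = 0j
    for m in range(samples):
        x = (m + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * k * x)
    return total * h / period

def real_coefficients(f, n, period, samples=5000):
    """a_n and b_n of (11.6), evaluated at the same midpoint nodes."""
    k = 2 * math.pi / period
    h = period / samples
    a = 0.0
    b = 0.0
    for m in range(samples):
        x = (m + 0.5) * h
        a += f(x) * math.cos(n * k * x)
        b += f(x) * math.sin(n * k * x)
    return 2 * a * h / period, 2 * b * h / period

period = 3.0                              # illustrative period
f = lambda x: x * (period - x)            # illustrative real function on [0, period]
c2 = complex_coefficient(f, 2, period)
a2, b2 = real_coefficients(f, 2, period)
```

With these definitions, `c2` should agree with `(a2 - 1j * b2) / 2` up to rounding, and for a real function the coefficient with index -n should be the complex conjugate of the one with index n.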


11.1.4 Half-range Fourier Series 


Fourier series expansions sometimes involve only sine or cosine terms. This
actually occurs when the function being expanded is either even [f(-x) =
f(x)] or odd [f(-x) = -f(x)] over the interval $[-\lambda/2, \lambda/2]$. When a given
function is even or odd, unnecessary work in determining Fourier coefficients
can be avoided. For instance, for an odd function $f_o(x)$, we have

\[ \begin{aligned}
a_n &= \frac{2}{\lambda} \int_{-\lambda/2}^{\lambda/2} f_o(x) \cos(nkx) \, dx \\
    &= \frac{2}{\lambda} \left\{ \int_{-\lambda/2}^{0} f_o(x) \cos(nkx) \, dx
       + \int_{0}^{\lambda/2} f_o(x) \cos(nkx) \, dx \right\} \\
    &= \frac{2}{\lambda} \left\{ -\int_{0}^{\lambda/2} f_o(x) \cos(nkx) \, dx
       + \int_{0}^{\lambda/2} f_o(x) \cos(nkx) \, dx \right\} \\
    &= 0 \qquad (n = 0, 1, 2, \cdots)
\end{aligned} \tag{11.11} \]

and

\[ \begin{aligned}
b_n &= \frac{2}{\lambda} \left\{ \int_{-\lambda/2}^{0} f_o(x) \sin(nkx) \, dx
       + \int_{0}^{\lambda/2} f_o(x) \sin(nkx) \, dx \right\} \\
    &= \frac{4}{\lambda} \int_{0}^{\lambda/2} f_o(x) \sin(nkx) \, dx
       \qquad (n = 1, 2, \cdots).
\end{aligned} \tag{11.12} \]

Here we used the identities cos(-nkx) = cos(nkx) and sin(-nkx) = -sin(nkx).
Accordingly, we have

\[ f_o(x) \sim \sum_{n=1}^{\infty} b_n \sin(nkx), \]

which is called the Fourier sine series.

Similarly, in the Fourier series corresponding to an even function $f_e(x)$,
the same process yields

\[ a_n = \frac{4}{\lambda} \int_{0}^{\lambda/2} f_e(x) \cos(nkx) \, dx
   \qquad (n = 0, 1, 2, \cdots) \tag{11.13} \]

and $b_n = 0$ for all n. Accordingly, the Fourier series becomes

\[ f_e(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(nkx), \]

which is called the Fourier cosine series.

Note that $a_n$ and $b_n$ given in (11.12) and (11.13) are computed in the
interval $[0, \lambda/2]$, whose width is half of the period $\lambda$. Thus, the Fourier sine or
cosine series of an odd or even function, respectively, is often called a
half-range Fourier series. As discussed later, half-range Fourier series expansion
is important from a practical viewpoint because it enables us to expand a
nonperiodic function within its domain.

Theorem:

If f(x) is an even or odd function and it is periodic with period $\lambda$, then
the Fourier coefficients $a_n$ and $b_n$ become

\[ a_n = \frac{4}{\lambda} \int_{0}^{\lambda/2} f(x) \cos(nkx) \, dx, \quad b_n = 0
   \qquad \text{if } f(x) \text{ is even} \]

and

\[ a_n = 0, \quad b_n = \frac{4}{\lambda} \int_{0}^{\lambda/2} f(x) \sin(nkx) \, dx
   \qquad \text{if } f(x) \text{ is odd.} \]


11.1.5 Fourier Series of Nonperiodic Functions 

A problem that arises quite often in applications is how to apply a Fourier
series expansion to a function f(x) that is defined only on the interval [0, L].
In this case, nothing is said about the periodicity of f(x). However, this does
not prevent us from writing the Fourier series of f(x), since the Euler-Fourier
formulas (11.2) involve only the finite interval.

Fig. 11.2. Functions $f_e(x)$ and $f_o(x)$ defined in (11.14) and (11.15), respectively


As an example, we try to expand the function

\[ f(x) = x \quad \text{for } [0, L] \]

as a Fourier series. In this case, f(x) is not periodic, but we can make it
a periodic function by extending it as an even or odd function over [-L, L]
and periodic with period 2L. The respective definitions of $f_e(x)$ and $f_o(x)$ in
[-L, L] are

\[ f_e(x) = \begin{cases} -x & \text{for } -L \le x < 0, \\ x & \text{for } 0 \le x \le L \end{cases} \tag{11.14} \]

and

\[ f_o(x) = x \quad \text{for } -L < x < L, \tag{11.15} \]

whose profiles are shown in Fig. 11.2.

First, we consider the case of the even function $f_e(x)$. In terms of the
Fourier cosine expansion, the coefficients $a_0$ and $a_n$ are given by

\[ a_0 = \frac{2}{L} \int_0^L x \, dx = L \]

and

\[ a_n = \frac{2}{L} \int_0^L f_e(x) \cos(nkx) \, dx
       = \frac{2L \left[ (-1)^n - 1 \right]}{\pi^2 n^2}
       = \begin{cases} -\dfrac{4L}{n^2 \pi^2}, & n = 1, 3, \cdots \\[1ex] 0, & n = 2, 4, \cdots \end{cases} \]

Here we have used $kL = \pi$. Hence, the cosine series becomes


\[ f(x) = \frac{L}{2} - \frac{4L}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2}
          \cos \frac{(2n-1)\pi x}{L}. \tag{11.16} \]


The partial sums of the series given in (11.16) are illustrated in Fig. 11.3.
Although the original function f(x) is defined only within the interval [0, L],
the resulting Fourier series produces not only f(x) in [0, L], but also the even
extension $f_e(x)$ with $f_e(x) = f_e(x + 2L)$.

Fig. 11.3. A partial sum on the right-hand side of (11.16)

Second, we look at the sine series of $f_o$ given by (11.15). In this case,

\[ b_n = \frac{2}{L} \int_0^L f_o(x) \sin(nkx) \, dx = \frac{2L (-1)^{n+1}}{n \pi} \]

and the sine series is

\[ f(x) = \frac{2L}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin(nkx). \tag{11.17} \]

Figure 11.4 shows some partial sums of (11.17). As in the case of even
extension, the Fourier series produces the odd extension $f_o(x)$ with $f_o(x) =
f_o(x + 2L)$.
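The two half-range expansions of f(x) = x can be compared by evaluating their partial sums directly. The following is a minimal sketch of (11.16) and (11.17) with $k = \pi/L$; the function names and the choice L = 1 in the usage lines are illustrative.

```python
import math

def cosine_partial_sum(x, L, terms):
    """Partial sum of the cosine series (11.16) for the even extension of f(x) = x."""
    total = L / 2
    for n in range(1, terms + 1):
        m = 2 * n - 1
        total -= 4 * L / (math.pi ** 2 * m ** 2) * math.cos(m * math.pi * x / L)
    return total

def sine_partial_sum(x, L, terms):
    """Partial sum of the sine series (11.17) for the odd extension of f(x) = x."""
    total = 0.0
    for n in range(1, terms + 1):
        total += (-1) ** (n + 1) / n * math.sin(n * math.pi * x / L)
    return 2 * L / math.pi * total

# Both series approximate f(x) = x inside [0, L]; they differ for negative x.
even_value = cosine_partial_sum(-0.3, 1.0, 200)   # close to +0.3 (even extension)
odd_value = sine_partial_sum(-0.3, 1.0, 2000)     # close to -0.3 (odd extension)
```

Evaluating either function at x and at x + 2L gives the same value up to rounding, which reflects the period 2L of both extensions.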



Fig. 11.4. A partial sum on the right-hand side of (11.17) 


11.1.6 The Rate of Convergence 

We have had two kinds of Fourier series representations for f(x) = x in the
interval [0, L]. This poses the following question: Does it make any difference
which kind of Fourier series, (11.16) or (11.17), we use to represent f(x) = x
in the interval [0, L]? Yes, it does. In the above-mentioned case, the even
extension $f_e(x)$ is more suitable than the odd extension $f_o(x)$ for the following
two reasons.

The first reason concerns the rate of convergence of the resulting Fourier
series. The coefficients given in (11.16) go as $1/(2n-1)^2$, whereas those in
(11.17) go as 1/n. Thus, the former series converges more quickly than the
latter. The difference in the rate of convergence is due to the fact that the
periodic extension of $f_e(x)$ is continuous, but that of $f_o(x)$ has discontinuities
at odd multiples of L. In general, the Fourier coefficients of discontinuous
functions decay as 1/n, whereas those of continuous functions decay at least
as rapidly as $1/n^2$. These observations as to the rate of convergence of the
coefficients with respect to n can be formulated as follows:

Theorem:

If f(x) and its first k derivatives satisfy the Dirichlet conditions on the
interval $[0, \lambda]$ and if the periodic extensions of $f(x), f'(x), \cdots, f^{(k-1)}(x)$
are all continuous, then the Fourier coefficients of f(x) decay at least as
rapidly as $1/n^{k+1}$.


The second reason is that the Fourier series representation corresponding to
the odd extension $f_o(x)$ exhibits a small discrepancy from the original function
f(x) around points of discontinuity of $f_o(x)$. This discrepancy is a Gibbs
phenomenon, illustrated in Sect. 11.3.5. When an extension generates points
of discontinuity, a Gibbs phenomenon will inevitably occur, which makes
the resulting Fourier series representation highly unreliable in the vicinity
of the discontinuity. Consequently, when performing half-range expansions of
nonperiodic functions, the way of extension that renders the resulting function
continuous (and smooth) over its domain is preferred.
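The overshoot near the discontinuity of the odd extension can be observed numerically. The sketch below samples partial sums of (11.16) and (11.17) on the interval [0.95 L, L]; the helper `peak_near_endpoint`, the grid size, and the number of terms are illustrative assumptions.

```python
import math

def sine_partial_sum(x, L, terms):
    """Partial sum of the sine series (11.17) for the odd extension of f(x) = x."""
    return (2 * L / math.pi) * sum(
        (-1) ** (n + 1) / n * math.sin(n * math.pi * x / L)
        for n in range(1, terms + 1)
    )

def cosine_partial_sum(x, L, terms):
    """Partial sum of the cosine series (11.16) for the even extension of f(x) = x."""
    return L / 2 - (4 * L / math.pi ** 2) * sum(
        math.cos((2 * n - 1) * math.pi * x / L) / (2 * n - 1) ** 2
        for n in range(1, terms + 1)
    )

def peak_near_endpoint(partial_sum, L=1.0, terms=100, grid=2000):
    """Largest sampled value of a partial sum on [0.95 L, L]."""
    return max(
        partial_sum(0.95 * L + 0.05 * L * m / grid, L, terms)
        for m in range(grid + 1)
    )

sine_peak = peak_near_endpoint(sine_partial_sum)       # exceeds L: Gibbs overshoot
cosine_peak = peak_near_endpoint(cosine_partial_sum)   # stays at or below L
```

With L = 1 the sine series peaks noticeably above the largest value of f(x) = x on [0, L], while the cosine series never exceeds it; adding more terms moves the sine peak closer to x = L without removing it.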


11.1.7 Fourier Series in Higher Dimensions 

It is important to generalize the Fourier series to more than one dimension.
This generalization is especially useful in crystallography and solid-state
physics, which deal with the three-dimensional periodic structures of atoms
and molecules. To generalize to N dimensions, we first consider a special
case in which an N-dimensional periodic function is a product of N
one-dimensional periodic functions. That is, we take the N functions $f^{(j)}(x)$
[j = 1, 2, ..., N] with period $L_j$:

\[ f^{(j)}(x) = \sum_{n=-\infty}^{\infty} c_n^{(j)} e^{2\pi i n x / L_j}, \qquad j = 1, 2, \cdots, N. \]

Let us define F(r) by the product of all the N functions

\[ \begin{aligned}
F(\mathbf{r}) &= f^{(1)}(x_1) f^{(2)}(x_2) \cdots f^{(N)}(x_N) \\
  &= \sum_{n_1} \sum_{n_2} \cdots \sum_{n_N}
     c^{(1)}_{n_1} c^{(2)}_{n_2} \cdots c^{(N)}_{n_N}
     e^{2\pi i (n_1 x_1 / L_1 + \cdots + n_N x_N / L_N)} \\
  &= \sum_{\mathbf{k}} C_{\mathbf{k}} e^{i \mathbf{k} \cdot \mathbf{r}},
\end{aligned} \tag{11.18} \]

where we have used the following new notation:

\[ C_{\mathbf{k}} = c^{(1)}_{n_1} c^{(2)}_{n_2} \cdots c^{(N)}_{n_N}, \qquad
   \mathbf{k} = 2\pi \left( n_1/L_1, n_2/L_2, \cdots, n_N/L_N \right), \qquad
   \mathbf{r} = (x_1, x_2, \cdots, x_N). \]

We take (11.18) as the definition of the Fourier series for any periodic function
of N variables. The definition of the coefficient $C_{\mathbf{k}}$ can be developed for a
general periodic function F(r) of N variables:

\[ F(\mathbf{r}) = \sum_{\mathbf{k}} C_{\mathbf{k}} e^{i \mathbf{k} \cdot \mathbf{r}}, \qquad
   C_{\mathbf{k}} = \frac{1}{V} \int_V F(\mathbf{r}) e^{-i \mathbf{k} \cdot \mathbf{r}} \, d^N x, \tag{11.19} \]

where $V = L_1 L_2 \cdots L_N$ determines the smallest region of periodicity in N
dimensions. When N = 1, (11.19) obviously reduces to the Fourier series in
one dimension.


Remark. The application of (11.19) requires some clarification regarding the
region V of the integral. In one dimension, the shape of the smallest region of
periodicity is unique, being simply a line segment of length L. In two or more
dimensions, however, such regions can have a variety of shapes. For instance,
in two dimensions, they can be rectangles, pentagons, hexagons, and so forth.
Thus, we let V in (11.19) stand for a primitive cell of the N-dimensional
lattice. This cell in three dimensions, which is important in solid-state physics,
is called the Wigner-Seitz cell.

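Recall that F(r) is a periodic function of r. This means that when r is changed
by R, where R is a vector describing the boundaries of a cell, then we should
get the same function: F(r + R) = F(r). This implies that the periodicity of
F(r) requires the vector k to take only restricted directions and magnitudes.
In fact, when replacing r in (11.19) by r + R, we have

\[ F(\mathbf{r} + \mathbf{R}) = \sum_{\mathbf{k}} C_{\mathbf{k}} e^{i \mathbf{k} \cdot (\mathbf{r} + \mathbf{R})}
   = \sum_{\mathbf{k}} \left( e^{i \mathbf{k} \cdot \mathbf{R}} \cdot C_{\mathbf{k}} e^{i \mathbf{k} \cdot \mathbf{r}} \right), \]

which is equal to F(r) if

\[ e^{i \mathbf{k} \cdot \mathbf{R}} = 1, \quad \text{i.e.,} \quad
   \mathbf{k} \cdot \mathbf{R} = 2\pi \times (\text{integer}). \tag{11.20} \]

Equation (11.20) is a key relation in determining the allowed directions and
magnitudes of the vector k. In one-dimensional cases, the inner product
reduces to $k \cdot R = (2\pi n / L) \cdot L = 2\pi n$; thus (11.20) obviously holds true. In
three dimensions, the vector R is represented as $\mathbf{R} = m_1 \mathbf{a}_1 + m_2 \mathbf{a}_2 + m_3 \mathbf{a}_3$,
where $m_1$, $m_2$, and $m_3$ are integers and $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$ are crystal axes,
which are not generally orthogonal. Hence, condition (11.20) is satisfied when
$\mathbf{k} = n_1 \mathbf{b}_1 + n_2 \mathbf{b}_2 + n_3 \mathbf{b}_3$, where $n_1$, $n_2$, and $n_3$ are integers and $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{b}_3$
are the reciprocal lattice vectors defined by

\[ \mathbf{b}_1 = \frac{2\pi (\mathbf{a}_2 \times \mathbf{a}_3)}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}, \qquad
   \mathbf{b}_2 = \frac{2\pi (\mathbf{a}_3 \times \mathbf{a}_1)}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}, \qquad
   \mathbf{b}_3 = \frac{2\pi (\mathbf{a}_1 \times \mathbf{a}_2)}{\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)}. \]

In fact,

\[ \mathbf{k} \cdot \mathbf{R} = \left( \sum_{i=1}^{3} n_i \mathbf{b}_i \right) \cdot \left( \sum_{j=1}^{3} m_j \mathbf{a}_j \right)
   = \sum_{i,j} n_i m_j \, \mathbf{b}_i \cdot \mathbf{a}_j, \]

and the reader may verify that $\mathbf{b}_i \cdot \mathbf{a}_j = 2\pi \delta_{ij}$. Thus we obtain

\[ \mathbf{k} \cdot \mathbf{R} = 2\pi \sum_{j=1}^{3} m_j n_j = 2\pi \times (\text{integer}). \]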
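The orthogonality relation between the crystal axes and the reciprocal lattice vectors can be verified for a concrete set of non-orthogonal axes. The sketch below implements the three defining formulas with plain tuples; the face-centered-cubic-type axes in the usage lines are an illustrative choice, not taken from the text.

```python
import math

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))

def reciprocal_vectors(a1, a2, a3):
    """Reciprocal lattice vectors b1, b2, b3 for crystal axes a1, a2, a3."""
    volume = dot(a1, cross(a2, a3))          # a1 . (a2 x a3), assumed nonzero
    scale = 2 * math.pi / volume
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3

# Illustrative non-orthogonal axes.
axes = ((0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0))
recip = reciprocal_vectors(*axes)

# Matrix of b_i . a_j divided by 2*pi; it should be the identity matrix.
gram = [[dot(b, a) / (2 * math.pi) for a in axes] for b in recip]
```

Because `gram` is the identity up to rounding, any k built from integer multiples of the reciprocal vectors satisfies condition (11.20) for any lattice vector R built from integer multiples of the axes.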


Exercises 

1. Expand the following functions in Fourier series:

(i) f(x) = sin ax on $[-\pi, \pi]$, where a is not an integer.

(ii) f(x) = sin ax on $[0, \pi]$, where a is not an integer.

Solution: It is straightforward to obtain the results:

\[ \text{(i)} \quad \sin ax = \frac{2 \sin a\pi}{\pi} \sum_{n=1}^{\infty} (-1)^n \frac{n \sin nx}{a^2 - n^2}, \]

\[ \text{(ii)} \quad \sin ax = \frac{1 - \cos a\pi}{\pi}
   \left\{ \frac{1}{a} + 2a \sum_{n=1}^{\infty} \frac{\cos 2nx}{a^2 - 4n^2} \right\}
   + \frac{2a}{\pi} \left( 1 + \cos a\pi \right) \sum_{n=0}^{\infty} \frac{\cos (2n+1)x}{a^2 - (2n+1)^2}. \]

2. Expand the function f(x) = cos x on $[0, \pi]$ in a Fourier sine series.

\[ \text{Solution:} \quad \cos x = \frac{8}{\pi} \sum_{n=1}^{\infty} \frac{n \sin 2nx}{4n^2 - 1}. \]


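3. (i) Find the Fourier series of f(x) = x on the interval $[-\pi, \pi]$.

(ii) Prove the identity $\dfrac{\pi}{4} = 1 - \dfrac{1}{3} + \dfrac{1}{5} - \dfrac{1}{7} + \cdots$.

Solution:

\[ \text{(i)} \quad f(x) = \sum_{n=1}^{\infty} \frac{2 (-1)^{n+1}}{n} \sin nx. \]

(ii) If we substitute $x = \pi/2$ in the series, we obtain

\[ \frac{\pi}{2} = \sum_{n=1}^{\infty} \frac{2 (-1)^{n+1}}{n} \sin \frac{n\pi}{2}
   = 2 \left( 1 - \frac{1}{3} + \frac{1}{5} - \cdots \right), \]

which obviously gives the desired result.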


4. Expand the function $f(x) = x^2$ into the Fourier cosine series on the domain
$[-\pi, \pi]$ and then prove that

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}
   \quad \text{and} \quad
   \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12}. \]

Solution: Straightforward calculations yield

\[ x^2 = \frac{\pi^2}{3} + \sum_{n=1}^{\infty} \frac{4 (-1)^n}{n^2} \cos nx. \]

By substituting $x = \pi$ and x = 0, we obtain the desired equations.


5. Determine both the cosine and sine series of $f(x) = x^3 - x$ defined on the
interval [0, 1]. Which series do you suppose converges more quickly?
Solution: We may set the odd and even extensions of f(x) over
[-1, 1], respectively, as

\[ f_o(x) = x^3 - x \quad \text{for } -1 < x < 1, \]

and

\[ f_e(x) = \begin{cases} -x^3 + x & \text{for } -1 < x < 0, \\ x^3 - x & \text{for } 0 < x < 1. \end{cases} \]

It follows that $f_o(x)$ is smoother than $f_e(x)$; namely, $f_e(x)$ has a
discontinuity in its derivative at x = 0 and at x = ±1. This implies that the sine
series converges more rapidly than the cosine series. In fact, straightforward
calculations yield the sine series

\[ f_o(x) = \frac{12}{\pi^3} \sum_{n=1}^{\infty} \frac{(-1)^n}{n^3} \sin(n\pi x) \]

and the cosine series

\[ f_e(x) = -\frac{1}{4} + \frac{2}{\pi^2} \sum_{n=1}^{\infty}
   \left\{ \frac{1 + (-1)^n 2}{n^2} + \frac{6}{\pi^2 n^4} \left[ 1 - (-1)^n \right] \right\} \cos(n\pi x). \]

The continuity of $f_o(x)$ and its first derivative leads to Fourier
coefficients that decay as $1/n^3$, whereas the continuity of $f_e(x)$
coupled with the discontinuity in $f_e'(x)$ leads to Fourier coefficients
that decay as $1/n^2$.


11.2 Mean Convergence of Fourier Series 


11.2.1 Mean Convergence Property 


We know that Fourier series are endowed with a specific class of convergence
called mean convergence (or convergence in the mean). This converging
behavior is expressed by an integral:

\[ \lim_{N \to \infty} \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} c_n e^{inkx} \right|^2 dx = 0. \tag{11.21} \]


Equation (11.21) applies regardless of the continuity and smoothness of the
function f(x), as long as f(x) is square-integrable.

Remark. From the viewpoint of Hilbert space theory, the relation (11.21)
comes from the completeness property of the set of functions $\{e^{inkx}\}$ in the
sense of the norm in the $L^2$ space. The $L^2$ space is a specific kind of Hilbert
space that is composed of a set of square-integrable functions f(x) expressed
by

\[ \int_a^b |f(x)|^2 \, dx < \infty. \]

The inner product (f, g) and the norm $\|f\|$ of elements $f, g \in L^2$, respectively,
are given by

\[ (f, g) = \int_a^b f(x)^* g(x) \, dx \quad \text{and} \quad
   \|f\|^2 = (f, f) = \int_a^b |f(x)|^2 \, dx. \]

The mean convergence of the Fourier series [i.e., the equality in (11.21)] holds
even when the integrand in (11.21) has a nonzero value at discrete points of
x. This comes from the fact that the definition of the mean convergence is
determined through integration, and that a finite number of discontinuities
of the integrand do not contribute to the result of its integration. This is
explained schematically in Fig. 11.5, in which we find

f(x): a continuous function,

$g_n^{(1)}(x)$: a series that converges uniformly to f(x)
except at a point of discontinuity x = a,

$g_n^{(2)}(x)$: a series that converges uniformly to f(x)
except at points of discontinuity $x = a_1, a_2, a_3, \cdots$.

As shown in Fig. 11.5, these three functions are distinct from one another.
However, if we integrate the squared deviation between two of them followed
by taking the limit $n \to \infty$, we have

\[ \lim_{n \to \infty} \int_0^{\lambda} \left| f(x) - g_n^{(1)}(x) \right|^2 dx
   = \lim_{n \to \infty} \int_0^{\lambda} \left| f(x) - g_n^{(2)}(x) \right|^2 dx = 0. \tag{11.22} \]

This is because the area enclosed between two of them vanishes as $n \to \infty$, i.e.,
the area right below (or above) points of discontinuity is zero owing to their
discreteness. Thus, the series $g_n^{(1)}(x)$ and $g_n^{(2)}(x)$ both converge to f(x) in the
mean regardless of their discrepancy from f(x) at points of discontinuity.

Fig. 11.5. Panels (a), (b), (c): sketches of a continuous function f(x), a series of
functions $g_n^{(1)}(x)$ converging to f(x) except at a discontinuity, and a similar series
of functions $g_n^{(2)}(x)$ having several discontinuities. Series $\{g_n^{(1)}(x)\}$ and
$\{g_n^{(2)}(x)\}$ both converge in the mean to f(x)


11.2.2 Dirichlet and Fejer Integrals 

It is instructive to give an alternative exposition of mean convergence of
Fourier series, which is based on two important concepts: Dirichlet's
integral and Fejer's integral.

Consider the partial sum $S_N(x)$ of the Fourier series of f(x) expressed by

\[ S_N(x) = \sum_{n=-N}^{N} c_n e^{inkx} \]

and its arithmetic mean

\[ \sigma_N(x) = \frac{1}{N+1} \left( S_0 + S_1 + \cdots + S_N \right). \tag{11.23} \]

After some algebra, we obtain their integral representations as given below
(see Exercises 1 and 2 in 11.2.2 for references).


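Dirichlet integral:

\[ S_N(x) = \frac{1}{\lambda} \int_{-x}^{\lambda - x} f(t + x)
   \frac{\cos(Nkt) - \cos\{(N+1)kt\}}{1 - \cos(kt)} \, dt. \tag{11.24} \]

Fejer integral:

\[ \sigma_N(x) = \frac{1}{(N+1)\lambda} \int_{-\lambda/2}^{\lambda/2} f(t + x)
   \frac{\sin^2 \frac{N+1}{2} kt}{\sin^2 \frac{1}{2} kt} \, dt. \tag{11.25} \]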


Remark. Note the distinctive difference between the convergence of $S_N$ and
that of $\sigma_N$. Whereas $\lim_{N \to \infty} S_N = S$ implies $\sigma_N \to S$, the converse does
not generally hold true; in fact, $\sigma_N$ may converge even when $S_N$ diverges.
A typical example is the case of the numerical sequence $u_n = (-1)^n$, where
$S_N = \sum_{n=0}^{N} u_n$ does not converge because $S_{2N} = 1$ and $S_{2N+1} = 0$, whereas
$\sigma_N = \sum_{n=0}^{N} S_n / (N+1)$ converges to 1/2.
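The behavior described in this remark is easy to reproduce. The sketch below forms the partial sums and their arithmetic means for $u_n = (-1)^n$, taking the sum to start at n = 0 (an assumption about the indexing); the function names are illustrative.

```python
def partial_sums(terms):
    """Running sums S_0, S_1, ... of a finite list of terms."""
    sums = []
    total = 0
    for u in terms:
        total += u
        sums.append(total)
    return sums

def arithmetic_means(sums):
    """Means sigma_N = (S_0 + ... + S_N) / (N + 1) for N = 0, 1, ..."""
    means = []
    running = 0
    for count, s in enumerate(sums, start=1):
        running += s
        means.append(running / count)
    return means

u = [(-1) ** n for n in range(1000)]
S = partial_sums(u)          # oscillates between 1 and 0 and never settles
sigma = arithmetic_means(S)  # approaches 1/2
```

The list `S` keeps alternating, whereas the entries of `sigma` settle near 0.5, which illustrates that averaging can produce a limit even when the partial sums themselves have none.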
The derivation of the identity (11.27) is straightforward. When f(x) = 1, we
have f(t + x) = 1, $c_0 = 1$, and $c_n = 0$ ($|n| \ge 1$), which obviously yield $S_N = 1$
for arbitrary N and thus $\sigma_N = 1$. Substitute this into (11.25) to obtain the
identity (11.27). Figure 11.6 plots the behavior of $D_N(t)$ with increasing N;
it shows maxima at $t = 0, \pm\lambda, \pm 2\lambda, \cdots$, and the magnitudes of the maxima
diverge as $N \to \infty$.
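The normalization of the kernel can be checked numerically. The sketch below assumes the kernel has the form $D_N(t) = \sin^2\{(N+1)kt/2\} / \{(N+1)\sin^2(kt/2)\}$, which is the form read off from the integrand of (11.25), and assumes that the identity in question states that the mean of $D_N$ over one period equals 1; both readings, as well as the function names, are assumptions of this sketch.

```python
import math

def kernel(t, N, period):
    """Assumed kernel D_N(t) = sin^2((N+1)kt/2) / ((N+1) sin^2(kt/2)), k = 2*pi/period."""
    k = 2 * math.pi / period
    s = math.sin(k * t / 2)
    if abs(s) < 1e-12:
        return float(N + 1)       # limiting value at t = 0, +-period, ...
    return math.sin((N + 1) * k * t / 2) ** 2 / ((N + 1) * s * s)

def mean_of_kernel(N, period, samples=20000):
    """(1/period) * integral of D_N over [-period/2, period/2], midpoint rule."""
    h = period / samples
    total = sum(kernel(-period / 2 + (m + 0.5) * h, N, period) for m in range(samples))
    return total * h / period

value = mean_of_kernel(5, 2.0)    # expected to be close to 1 for every N
```

The peak height `kernel(0.0, N, period)` grows like N + 1 while the mean stays fixed, so the kernel concentrates around t = 0 as N increases.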

From (11.25) and (11.27), we arrive at the key relation

\[ \sigma_N(x) - f(x) = \frac{1}{\lambda} \int_{-\lambda/2}^{\lambda/2}
   \left\{ f(t + x) - f(x) \right\} D_N(t) \, dt. \tag{11.28} \]

If f(x) is continuous (piecewise, at least), the integral in (11.28) can be made
arbitrarily small by taking a sufficiently large N (see Exercise 3 below). To
be precise, there exists an m for each $\varepsilon > 0$ such that

\[ N > m \;\Rightarrow\; |\sigma_N(x) - f(x)| < \varepsilon. \tag{11.29} \]

Fig. 11.6. Dirichlet's kernel $D_N(t)$ defined in (11.26)

This clearly means that $\sigma_N(x)$ converges uniformly to f(x) if f(x) is continuous.

As is shown later, the result (11.29) immediately yields the mean convergence
of the Fourier series to f(x).


11.2.3 Proof of the Mean Convergence of Fourier Series 


We are now in a position to prove the mean convergence property of Fourier
series.

The function $\sigma_N(x)$ can be expressed as a trigonometric polynomial,
since it consists of the N + 1 trigonometric polynomials $S_0, S_1, \cdots, S_N$ as given by
(11.23). Hence, (11.29) implies the existence of a trigonometric series that
converges uniformly to f(x). [This is simply Fejer's theorem (see Sect. 11.3.2).]
Thus we have

\[ \sigma_N(x) = \sum_{n=-N}^{N} \gamma_n e^{inkx}, \]

where all the coefficients $\{\gamma_n\}$ have to be determined.

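We now make use of the fact that for any choice of $\{\gamma_n\}$, the inequality

\[ \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} \gamma_n e^{inkx} \right|^2 dx
   \ge \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} c_n e^{inkx} \right|^2 dx \]

holds true with the Fourier coefficients $\{c_n\}$ of f(x). (See the discussion in
Sect. 11.2.4 for the proof.) Taking the limit $N \to \infty$ yields

\[ \lim_{N \to \infty} \int_0^{\lambda} \left| f(x) - \sigma_N(x) \right|^2 dx
   \ge \lim_{N \to \infty} \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} c_n e^{inkx} \right|^2 dx. \tag{11.30} \]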


Let f(x) be continuous (piecewise, at least). Then the left-hand side vanishes
owing to the uniform convergence of $\sigma_N(x)$ to f(x) at continuous points x of
f(x). (A finite number of discontinuous points of f(x) makes no contribution
to the integral.) Eventually, we come to the desired conclusion:

\[ \lim_{N \to \infty} \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} c_n e^{inkx} \right|^2 dx = 0, \]

which is a restatement of (11.21).

11.2.4 Parseval Identity


The mean convergence property of Fourier series can be represented by a
more concise expression, called the Parseval identity. We first note the
main conclusion of this subsection and then go on to its proof. For simplicity
of notation, we use the following short form:

\[ \frac{1}{\lambda} \int_0^{\lambda} f(x) g^*(x) \, dx = (f, g). \]

Parseval identity:

A necessary and sufficient condition for the mean convergence of the
Fourier series of f(x) is given by

\[ (f, f) = \sum_{n=-\infty}^{\infty} |c_n|^2, \]

which is called the Parseval identity.


To prove the above statement, we assume f(x) to be square-integrable and
consider the total squared error of f(x) relative to the series of exponential
functions:

\[ \mathcal{E}_N = \frac{1}{\lambda} \int_0^{\lambda}
   \left| f(x) - \sum_{n=-N}^{N} \gamma_n e^{inkx} \right|^2 dx, \tag{11.31} \]

whose variables are N and the sequence $\{\gamma_n\}$ consisting of complex numbers.
Term-by-term integration of (11.31) yields

\[ \begin{aligned}
\mathcal{E}_N &= (f, f) - \sum_{n=-N}^{N} \gamma_n^* \left( f, e^{inkx} \right)
   - \sum_{n=-N}^{N} \gamma_n \left( f, e^{inkx} \right)^*
   + \sum_{m,n=-N}^{N} \gamma_m^* \gamma_n \left( e^{inkx}, e^{imkx} \right) \\
 &= (f, f) - \sum_{n=-N}^{N} \left( \gamma_n^* c_n + \gamma_n c_n^* \right)
   + \sum_{n=-N}^{N} \gamma_n^* \gamma_n \\
 &= (f, f) + \sum_{n=-N}^{N} |\gamma_n - c_n|^2 - \sum_{n=-N}^{N} |c_n|^2.
\end{aligned} \tag{11.32} \]

Here we have used the orthonormality of imaginary exponentials,
$(e^{inkx}, e^{imkx}) = \delta_{mn}$, as well as the definition of the Fourier coefficient
$c_n = (f, e^{inkx})$. Note that (f, f) appearing in (11.32) is nonnegative because

\[ (f, f) = \frac{1}{\lambda} \int_0^{\lambda} |f(x)|^2 \, dx \ge 0. \]

Hence, $\mathcal{E}_N$ becomes minimal when $\gamma_n = c_n$ and its minimum value reads

\[ \min\{\mathcal{E}_N\} = (f, f) - \sum_{n=-N}^{N} |c_n|^2. \tag{11.33} \]

n=—N 


We are now ready to complete our proof. Recall that the mean convergence
of the Fourier series for f(x) is defined by

\[ \lim_{N \to \infty} \int_0^{\lambda} \left| f(x) - \sum_{n=-N}^{N} c_n e^{inkx} \right|^2 dx = 0. \tag{11.34} \]

From (11.31) and (11.33), we see that the definition of the mean convergence
(11.34) is rewritten as

\[ \lim_{N \to \infty} \min\{\mathcal{E}_N\} = 0, \tag{11.35} \]

or equivalently,

\[ (f, f) = \sum_{n=-\infty}^{\infty} |c_n|^2. \tag{11.36} \]

Relation (11.36) is thus a necessary and sufficient condition for satisfying
the mean convergence of the Fourier series to f(x). Since Parseval's identity
applies to any square-integrable function f, Fourier series for the functions f
surely converge in the mean to f(x).
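Parseval's identity can be illustrated with the function f(x) = x on $[-\pi, \pi]$, whose sine coefficients are $b_n = 2(-1)^{n+1}/n$. The sketch below assumes the normalized inner product $(f, g) = (1/\lambda)\int f g^* dx$, under which $(f, f) = \pi^2/3$ and $|c_n|^2 = 1/n^2$ for $n \ne 0$ with $c_0 = 0$; these closed forms are derived for this sketch rather than quoted from the text.

```python
import math

def coefficient_sum(N):
    """Sum of |c_n|^2 over 0 < |n| <= N for f(x) = x on [-pi, pi], where |c_n|^2 = 1/n^2."""
    return 2 * sum(1 / n ** 2 for n in range(1, N + 1))

# (f, f) = (1 / (2*pi)) * integral of x^2 over [-pi, pi] = pi^2 / 3.
norm_squared = math.pi ** 2 / 3

def minimum_error(N):
    """Minimum squared error (f, f) - sum of |c_n|^2, cf. (11.33)."""
    return norm_squared - coefficient_sum(N)

errors = [minimum_error(N) for N in (10, 100, 1000)]
```

The entries of `errors` are positive and decrease toward zero, in line with the inequality $(f, f) \ge \sum |c_n|^2$ for every finite N and with equality in the limit.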


11.2.5 Riemann-Lebesgue Theorem

As by-products of the argument in 11.2.4, we obtain the following two
important properties regarding the Fourier series expansion. The first is the Bessel
inequality

\[ (f, f) \ge \sum_{n=-N}^{N} |c_n|^2. \tag{11.37} \]

This is obtained from the fact that $\min\{\mathcal{E}_N\}$ given in (11.33) is nonnegative.
Here we can let $N \to \infty$ in (11.37), because the right-hand side of (11.37)
forms a monotonically increasing sequence that is bounded by its left-hand
side. Then we obtain

\[ (f, f) \ge \sum_{n=-\infty}^{\infty} |c_n|^2. \tag{11.38} \]

We further note that the series on the right-hand side of (11.38) necessarily
converges, since it is nondecreasing and bounded from above. Consequently,
we arrive at the second important property to be noted:

\[ \lim_{n \to \infty} c_n = 0. \tag{11.39} \]

Separating the real and imaginary parts in (11.39), we eventually obtain the
following theorem:

Riemann-Lebesgue theorem:

If f(x) is square-integrable on the interval $[0, \lambda]$, then

\[ \lim_{n \to \infty} \int_0^{\lambda} f(x) \cos(nkx) \, dx = 0, \qquad
   \lim_{n \to \infty} \int_0^{\lambda} f(x) \sin(nkx) \, dx = 0. \]

Exercises 


1. Derive the expressions (11.24) and (11.25) regarding the Dirichlet and
Fejer integrals, respectively.

Solution: From the definition of $c_n$, the partial sum $S_N(x)$ yields
its integral form:

\[ \begin{aligned}
S_N(x) &= \sum_{n=-N}^{N} \left\{ \frac{1}{\lambda} \int_0^{\lambda} f(t) e^{-inkt} \, dt \right\} \cdot e^{inkx} \\
  &= \frac{1}{\lambda} \int_0^{\lambda} f(t) \left\{ \sum_{n=-N}^{N} e^{-ink(t - x)} \right\} dt \\
  &= \frac{1}{\lambda} \int_{-x}^{\lambda - x} f(t + x) \left( \sum_{n=-N}^{N} e^{-inkt} \right) dt.
\end{aligned} \tag{11.40} \]

The finite series of exponential terms reads

\[ \sum_{n=-N}^{N} e^{-inkt} = e^{-iNkt} \sum_{n=0}^{2N} e^{inkt}
   = e^{-iNkt} \, \frac{1 - e^{i(2N+1)kt}}{1 - e^{ikt}}
   = \frac{\cos(Nkt) - \cos\{(N+1)kt\}}{1 - \cos(kt)}. \]

Substituting this in (11.40) yields Dirichlet's integral (11.24).
Moreover, its arithmetic mean reduces to Fejer's integral (11.25)
as demonstrated by

\[ \begin{aligned}
\sigma_N(x) &= \frac{1}{N+1} \left\{ S_0 + S_1 + \cdots + S_N \right\} \\
  &= \frac{1}{(N+1)\lambda} \int_{-x}^{\lambda - x} f(t + x)
     \frac{1 - \cos\{(N+1)kt\}}{1 - \cos(kt)} \, dt \\
  &= \frac{1}{(N+1)\lambda} \int_{-\lambda/2}^{\lambda/2} f(t + x)
     \frac{\sin^2 \frac{N+1}{2} kt}{\sin^2 \frac{1}{2} kt} \, dt.
\end{aligned} \tag{11.41} \]

In the last line of (11.41), the interval of the integration
$[-x, \lambda - x]$ is replaced by $[-\lambda/2, \lambda/2]$ by taking account of the
periodicity of the integrand.
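The closed form of the finite exponential sum used in this solution can be confirmed numerically. The following sketch compares the direct sum with the ratio of cosines; the function names and the sample values of t, N, and k are illustrative.

```python
import cmath
import math

def exponential_sum(t, N, k):
    """Direct evaluation of the sum of exp(-i n k t) for n = -N, ..., N."""
    return sum(cmath.exp(-1j * n * k * t) for n in range(-N, N + 1))

def closed_form(t, N, k):
    """Closed form (cos(Nkt) - cos((N+1)kt)) / (1 - cos(kt)); requires cos(kt) != 1."""
    return (math.cos(N * k * t) - math.cos((N + 1) * k * t)) / (1 - math.cos(k * t))

direct = exponential_sum(0.37, 4, 2.0)
formula = closed_form(0.37, 4, 2.0)
```

The imaginary part of `direct` vanishes up to rounding, since the terms with indices n and -n are complex conjugates, and its real part agrees with `formula`.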


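2. Prove that $\sigma_N(x)$ uniformly converges to f(x) by postulating the
continuity of f(x).

Solution: Recall that the continuity of f(x) allows us to determine
a $\delta$ that satisfies

\[ |x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \varepsilon \tag{11.42} \]

for an arbitrarily small positive $\varepsilon$. Further, owing to its continuity,
the function f(x) is bounded as $|f(x)| < M$ with an appropriate
constant M. We divide the range of integration given
in (11.28) as $\int_{-\lambda/2}^{\lambda/2} = \int_{-\lambda/2}^{-\delta} + \int_{-\delta}^{\delta} + \int_{\delta}^{\lambda/2}$ and use the inequality
(11.42) to obtain, for the middle term,

\[ \left| \int_{-\delta}^{\delta} \{ f(t + x) - f(x) \} D_N(t) \, dt \right|
   \le \int_{-\delta}^{\delta} |f(t + x) - f(x)| \, D_N(t) \, dt
   < \varepsilon \int_{-\delta}^{\delta} D_N(t) \, dt < \varepsilon \lambda \tag{11.43} \]

and

\[ \begin{aligned}
\left| \left( \int_{-\lambda/2}^{-\delta} + \int_{\delta}^{\lambda/2} \right) \{ f(t + x) - f(x) \} D_N(t) \, dt \right|
 &\le \left( \int_{-\lambda/2}^{-\delta} + \int_{\delta}^{\lambda/2} \right)
      \{ |f(t + x)| + |f(x)| \} \frac{\sin^2 \frac{N+1}{2} kt}{(N+1) \sin^2 \frac{1}{2} kt} \, dt \\
 &\le \left( \int_{-\lambda/2}^{-\delta} + \int_{\delta}^{\lambda/2} \right)
      \frac{2M}{(N+1) \sin^2 (k\delta/2)} \, dt \\
 &< \frac{2M\lambda}{(N+1) \sin^2 (k\delta/2)}.
\end{aligned} \tag{11.44} \]

From (11.28), (11.43), and (11.44), we obtain

\[ |\sigma_N(x) - f(x)| < \varepsilon + \frac{2M}{(N+1) \sin^2 (k\delta/2)}. \]

Taking the limit $N \to \infty$ with the small quantity $\delta$ fixed, the
second term vanishes. We thus conclude that

\[ \lim_{N \to \infty} |\sigma_N(x) - f(x)| = 0. \tag{11.45} \]

11.3 Uniform Convergence of Fourier Series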


11.3.1 Criterion for Uniform and Pointwise Convergence 


We know that the Fourier series of $f(x)$ converges in the mean to $f(x)$ as long as $f(x)$ is square-integrable. However, the mean convergence of the Fourier series provides no information as to its uniform convergence. In order for the Fourier series to converge (uniformly or pointwise) to the original function $f(x)$, several conditions regarding the continuity and periodicity of $f(x)$ have to be satisfied. These are formally stated in the following two theorems:


Uniform convergence of Fourier series:

The Fourier series of a continuous, piecewise smooth, and periodic function $f(x)$ converges to $f(x)$ absolutely and uniformly.

Pointwise convergence of Fourier series:

The Fourier series of a piecewise smooth and periodic function $f(x)$ (continuous or discontinuous) converges to:

(i) $f(x)$ at any point of continuity, and

(ii) $\dfrac{f(x+0)+f(x-0)}{2}$ at any point of discontinuity.


Our main concern in this section is to prove these two theorems, and we follow 
this by demonstrating several important features of Fourier series that occur 
at discontinuous points of f(x). 

Remark. Observe that the above theorems are consistent with the conclusion 
of the Dirichlet theorem given in Sect. 11.1.2; the latter says that a Fourier 
series representation becomes identical to f(x) provided that f(x) is periodic, 
continuous, and further smooth (piecewise, at least). 
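The pointwise criterion can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes $\lambda = 2\pi$ (so $k = 1$) and uses the test function $f(x) = e^x$ on $(-\pi, \pi)$, extended periodically, whose complex Fourier coefficients are $c_n = (-1)^n \sinh\pi / [\pi(1 - in)]$. The function name `partial_sum_exp` is hypothetical. At the jump $x = \pi$ the partial sums should approach the midpoint $\cosh\pi$, while at the interior point $x = 0$ they should approach $f(0) = 1$.

```python
import math

def partial_sum_exp(x, N):
    """Partial sum S_N(x) of the Fourier series of f(x) = exp(x) on (-pi, pi),
    extended with period 2*pi (assumed here: lambda = 2*pi, k = 1)."""
    total = 0.0 + 0.0j
    for n in range(-N, N + 1):
        # c_n = (1/2pi) * integral of exp(x) * exp(-i n x) over (-pi, pi)
        c_n = ((-1) ** n) * math.sinh(math.pi) / (math.pi * (1.0 - 1j * n))
        total += c_n * complex(math.cos(n * x), math.sin(n * x))
    return total.real

if __name__ == "__main__":
    midpoint = math.cosh(math.pi)  # [f(pi - 0) + f(-pi + 0)] / 2
    for N in (10, 100, 1000):
        print(N, partial_sum_exp(math.pi, N), midpoint, partial_sum_exp(0.0, N))
```

The convergence at the jump is slow (the error decays roughly like $1/N$), whereas at the point of continuity it is much faster.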


11.3.2 Fejér Theorem

The theorems given in the previous subsection exhibit sufficient conditions for the Fourier series to converge. It is instructive to compare them with the Fejér theorem:

Fejér theorem:

Any continuous and periodic function $f(x)$ with period $\lambda$ can be reproduced by an infinite trigonometric series, in the sense that

$$\lim_{N\to\infty}\left|f(x)-\sum_{n=-N}^{N}\gamma_n e^{inkx}\right| = 0 \quad \text{for all } x, \tag{11.46}$$

with an appropriate choice of the set of expansion coefficients $\{\gamma_n\}$.


At first glance, Fejér's theorem appears to ensure the uniform convergence of the Fourier series. However, this is not the case; the sequence of the optimal coefficients $\{\gamma_n\}$ satisfying (11.46) cannot in general be replaced by the Fourier coefficients $\{c_n\}$ defined by

$$c_n = \frac{1}{\lambda}\int_0^{\lambda} f(x)e^{-inkx}\,dx.$$

In fact, even when $f(x)$ is continuous and periodic, its Fourier series may diverge at discrete points, as is expressed by

$$\lim_{N\to\infty}\left|f(x)-\sum_{n=-N}^{N}c_n e^{inkx}\right| = \infty \quad \text{at some points } x. \tag{11.47}$$

Hence, Fejér's theorem does not guarantee the uniform convergence of the Fourier series representation. Equation (11.47) also suggests that the continuity and periodicity of $f(x)$ are only necessary, but not sufficient, conditions for the uniform convergence of its Fourier series to the original function $f(x)$.


11.3.3 Proof of Uniform Convergence 


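We are now in a position to prove the criterion for uniform convergence of the Fourier series $\sum_{n=-\infty}^{\infty} c_n e^{inkx}$ to $f(x)$. The proof that is presented below is based on the mean convergence property of Fourier series. Recall that the mean convergence of Fourier series is expressed by

$$\lim_{N\to\infty}\int_0^{\lambda}\left|f(x)-\sum_{n=-N}^{N}c_n e^{inkx}\right|^2 dx = 0. \tag{11.48}$$

In general, the relation

$$\lim_{N\to\infty}\int_a^b\left|\sum_{n=0}^{N}u_n(x)\right|^2 dx = 0 \quad\text{means}\quad \lim_{N\to\infty}\left|\sum_{n=0}^{N}u_n(x)\right| = 0$$

if and only if the infinite series $\sum_{n=0}^{\infty}u_n(x)$ converges uniformly to a certain function of $x$ within the range of integration $[a, b]$. Therefore, in order to obtain the desired equality

$$f(x) = \sum_{n=-\infty}^{\infty}c_n e^{inkx}$$

for any $x \in [0, \lambda]$, we must seek the condition that the infinite series $\sum_{n=-\infty}^{\infty}c_n e^{inkx}$ converges uniformly to some function of $x$ [not necessarily to $f(x)$]. Thus, we rewrite the Fourier coefficient $c_n$ as

$$\begin{aligned}
c_n &= \frac{1}{\lambda}\int_0^{\lambda}f(x)e^{-inkx}\,dx
= \frac{1}{\lambda}\left[\frac{f(x)e^{-inkx}}{-ink}\right]_0^{\lambda} + \frac{1}{ink\lambda}\int_0^{\lambda}f'(x)e^{-inkx}\,dx\\
&= \frac{1}{ink\lambda}\int_0^{\lambda}f'(x)e^{-inkx}\,dx = \frac{c'_n}{ink},
\end{aligned} \tag{11.49}$$

where $c'_n$ is the Fourier coefficient of the derivative $f'(x)$. Here $f(x)$ is assumed to be periodic, i.e., $f(0) = f(\lambda)$ and $k\lambda = 2\pi$, so that the boundary term vanishes. We further assume that $f(x)$ is continuous and smooth (piecewise, at least) on the interval $[0, \lambda]$. Then, $f'(x)$ is continuous (or piecewise continuous) to yield Parseval's identity:

$$\frac{1}{\lambda}\int_0^{\lambda}|f'(x)|^2\,dx = \sum_{n=-\infty}^{\infty}|c'_n|^2 = A,$$

where $A$ is a constant. Observe that

$$\sum_{n=1}^{\infty}|c_n| = \sum_{n=1}^{\infty}\left|\frac{c'_n}{ink}\right| = \sum_{n=1}^{\infty}\frac{|c'_n|}{nk}. \tag{11.50}$$

From the Schwarz inequality, it follows that

$$\sum_{n=1}^{\infty}\frac{|c'_n|}{n} \le \left(\sum_{n=1}^{\infty}|c'_n|^2\right)^{1/2}\left(\sum_{n=1}^{\infty}\frac{1}{n^2}\right)^{1/2} \le \sqrt{A}\left(\sum_{n=1}^{\infty}\frac{1}{n^2}\right)^{1/2}. \tag{11.51}$$

Here the series $\sum_{n=1}^{\infty}(1/n^2)$ is convergent (see the remark below). Hence, from (11.50) and (11.51), we see that $\sum_{n=1}^{\infty}|c_n|$ converges; the terms with $n < 0$ are treated in the same way. This implies that $\sum_{n=-\infty}^{\infty}c_n e^{inkx}$ converges uniformly to a certain function on $[0, \lambda]$ since $|c_n e^{inkx}| \le |c_n|$ for all $n$ on $[0, \lambda]$. (See Sect. 3.3.1 for the criteria for uniform convergence.) This completes our proof.

Remark. That the series $\sum_{n=1}^{\infty}(1/n^2)$ is convergent is verified as follows: let $A_{2^{k+1}-1}$ be the partial sum consisting of the first $2^{k+1}-1$ terms. Then we have

$$\begin{aligned}
A_{2^{k+1}-1} &= 1 + \left(\frac{1}{2^2}+\frac{1}{3^2}\right) + \left(\frac{1}{4^2}+\cdots+\frac{1}{7^2}\right) + \cdots + \left(\frac{1}{(2^k)^2}+\cdots+\frac{1}{(2^{k+1}-1)^2}\right)\\
&\le 1 + \frac{1}{2^2}\times 2 + \frac{1}{4^2}\times 4 + \cdots + \frac{1}{(2^k)^2}\times 2^k\\
&= \sum_{j=0}^{k}\left(\frac{1}{2}\right)^j = \frac{1-(1/2)^{k+1}}{1-(1/2)} < 2.
\end{aligned}$$

This means that $A_{2^{k+1}-1}$ for any $k$ is bounded above. Furthermore, the sequence $(A_m)$ is monotonically increasing. Hence, $(A_m)$ converges in the limit of $m \to \infty$, which completes the proof.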
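The uniform convergence established in Sect. 11.3.3 can be checked numerically. The sketch below is an illustration rather than part of the original text. It assumes $\lambda = 2\pi$ (so $k = 1$) and takes as a test case the continuous, piecewise smooth triangle wave $f(x) = |x|$ on $[-\pi, \pi]$, whose Fourier series is $\pi/2 - (4/\pi)\sum_{n\ \mathrm{odd}} \cos(nx)/n^2$. The function names are hypothetical. The largest deviation over one period should shrink as the number of harmonics grows.

```python
import math

def triangle(x):
    """2*pi-periodic triangle wave, equal to |x| on [-pi, pi]."""
    x = (x + math.pi) % (2.0 * math.pi) - math.pi
    return abs(x)

def partial_sum(x, N):
    """Fourier partial sum of the triangle wave with harmonics |n| <= N."""
    s = math.pi / 2.0
    for n in range(1, N + 1, 2):  # only odd harmonics are nonzero
        s -= 4.0 / (math.pi * n * n) * math.cos(n * x)
    return s

def sup_error(N, samples=2000):
    """Largest |S_N(x) - f(x)| over a uniform grid covering one period."""
    worst = 0.0
    for j in range(samples + 1):
        x = -math.pi + 2.0 * math.pi * j / samples
        worst = max(worst, abs(partial_sum(x, N) - triangle(x)))
    return worst

if __name__ == "__main__":
    for N in (5, 50, 500):
        print(N, sup_error(N))
```

Because $\sum |c_n|$ converges for this function, the worst-case error is bounded by the tail of that sum, independently of $x$, which is exactly the mechanism used in the proof.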


11.3.4 Pointwise Convergence at Discontinuous Points 

This subsection gives an account of the second criterion in Sect. 11.3.1, which is restated below.

Pointwise convergence at discontinuities:

When a function $f(x)$ is piecewise continuous and piecewise smooth, its Fourier series converges pointwise to $\{f(x+0)+f(x-0)\}/2$ at a point of discontinuity.


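This theorem can be proven in the following manner. It readily follows from (11.24) that the partial sum of the Fourier series $S_N(x)$ is expressed by

$$S_N(x) = \frac{1}{2i\lambda}\int_{-x}^{\lambda-x} f(x+t)\,\frac{e^{i(N+\frac{1}{2})kt}-e^{-i(N+\frac{1}{2})kt}}{\sin\left(\frac{1}{2}kt\right)}\,dt.$$

We rewrite this as

$$\begin{aligned}
S_N(x) &= \frac{1}{2i\lambda}\int_{-x}^{\lambda-x} f(x+t)\,\frac{e^{i\frac{1}{2}kt}}{\sin\left(\frac{1}{2}kt\right)}\,e^{iNkt}\,dt - \frac{1}{2i\lambda}\int_{-x}^{\lambda-x} f(x+t)\,\frac{e^{-i\frac{1}{2}kt}}{\sin\left(\frac{1}{2}kt\right)}\,e^{-iNkt}\,dt\\
&= \frac{1}{2i\lambda}\int_{-x}^{\lambda-x}\{f(x+t)+f(x-t)\}\,\frac{e^{i\frac{1}{2}kt}}{\sin\left(\frac{1}{2}kt\right)}\,e^{iNkt}\,dt.
\end{aligned}$$

Here we have set $t \to -t$ in the second integral in the first line and used the periodicity of the integrand. Further,

$$S_N(x) = \frac{1}{2i\lambda}\int_{-x}^{\lambda-x} g(t)\,e^{iNkt}\,dt + \frac{f(x+0)+f(x-0)}{2i\lambda}\int_{-x}^{\lambda-x}\frac{e^{i\frac{1}{2}kt}e^{iNkt}}{\sin\left(\frac{1}{2}kt\right)}\,dt, \tag{11.52}$$

where we have introduced the notation

$$g(t) = \{f(x+t)-f(x+0)+f(x-t)-f(x-0)\}\,\frac{e^{i\frac{1}{2}kt}}{\sin\left(\frac{1}{2}kt\right)}.$$

The second term in (11.52) can be simplified via the relation

$$\frac{1}{i\lambda}\int_{-x}^{\lambda-x}\frac{e^{i\frac{1}{2}kt}e^{iNkt}}{\sin\left(\frac{1}{2}kt\right)}\,dt = 1.$$

(See Exercise 3 in Sect. 11.3 for its derivation.) Substituting this into (11.52), we get

$$S_N(x) = \frac{1}{2i\lambda}\int_{-x}^{\lambda-x} g(t)\,e^{iNkt}\,dt + \frac{f(x+0)+f(x-0)}{2}. \tag{11.53}$$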


If the integral term in (11.53) vanishes as $N \to \infty$, we obtain the desired result. In fact, when $g(t)$ is piecewise continuous in the interval $[-x, \lambda - x]$, the integral in (11.53) vanishes owing to the Riemann-Lebesgue theorem (see Sect. 11.2.5). The remaining task is, therefore, to prove the piecewise continuity of $g(t)$ on $[-x, \lambda - x]$, which is verified through the following discussion.

When $t \neq 0$, $f$ is piecewise continuous and $e^{i\frac{1}{2}kt}/\sin\left(\frac{1}{2}kt\right)$ is continuous; thus $g(t)$ is surely piecewise continuous. Near $t = 0$, we write

$$g(t) = \left\{\frac{f(x+t)-f(x+0)}{t} + \frac{f(x-t)-f(x-0)}{t}\right\}\frac{t\,e^{i\frac{1}{2}kt}}{\sin\left(\frac{1}{2}kt\right)},$$

so

$$\lim_{t\to 0} g(t) = \frac{2}{k}\left\{\lim_{t\to 0}\frac{f(x+t)-f(x+0)}{t} + \lim_{t\to 0}\frac{f(x-t)-f(x-0)}{t}\right\}. \tag{11.54}$$

The first and second terms in (11.54) are the derivatives of $f(x)$ on the right and left, respectively (the latter up to sign). Since $f(t)$ is assumed to be piecewise smooth, $f'(t)$ is piecewise continuous; thus both terms in (11.54) exist. This indicates that the limit $\lim_{t\to 0} g(t)$ exists, so $g(t)$ is piecewise continuous within the interval $[-x, \lambda - x]$.

Consequently, we can conclude from (11.53) that

$$\lim_{N\to\infty} S_N(x) = \frac{f(x+0)+f(x-0)}{2},$$

which implies the pointwise convergence of the Fourier series to $[f(x+0)+f(x-0)]/2$.

11.3.5 Gibbs Phenomenon

If a function $f(x)$ has discontinuities in the defining region, its Fourier series does not reproduce the behavior of $f(x)$ at points of discontinuity. In other words, the partial sums of a Fourier series cannot approach $f(x)$ uniformly in the vicinity of a point of discontinuity. Furthermore, close to discontinuous points, the Fourier series inevitably overshoots the value of the original function to be expanded. The size of the overshoot is proportional to the magnitude of the discontinuity. This overshoot, known as the Gibbs phenomenon, is nicely illustrated with the Fourier series for the step function

$$f(x) = \begin{cases} +1 & \text{for } 0 < x < \lambda/2,\\ -1 & \text{for } \lambda/2 < x < \lambda,\end{cases}$$

which is a periodic square wave with period $\lambda$. The complex Fourier coefficient $c_n$ reads

$$c_n = \frac{1}{\lambda}\left[\int_0^{\lambda/2} e^{-inkx}\,dx - \int_{\lambda/2}^{\lambda} e^{-inkx}\,dx\right] = \frac{1}{2in\pi}\left(1-e^{-in\pi}\right)^2 = \begin{cases} 0 & (n = \text{even}),\\ \dfrac{2}{in\pi} & (n = \text{odd}).\end{cases}$$

Then we have

$$f(x) \sim \sum_{n=\cdots,-3,-1,1,3,\cdots}\frac{2}{in\pi}\,e^{inkx} = \sum_{n=1,3,\cdots}\frac{2}{in\pi}\left(-e^{-inkx}+e^{inkx}\right) = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)kx}{2n-1}. \tag{11.55}$$

Figure 11.7 shows the approximation of $f(x)$ for $0 < x < \lambda$ by the sum of four, six, and ten terms of the series. Three features deserve attention.

(i) There is a steady increase in the accuracy of the representation as the number of included terms is increased.

(ii) All the curves pass through the midpoint $f(x) = 0$ at the points of discontinuity $x = n\lambda/2$ ($n = 0, \pm 1, \pm 2, \cdots$).

(iii) In the vicinity of $x = n\lambda/2$, there is an overshoot that persists and shows no sign of diminishing.

As more and more terms are taken, the small oscillations along each horizontal portion get smaller and smaller and, except for the outermost ones of each portion, closest to the discontinuities, eventually disappear. Even in the limit of an infinite number of terms, there is still a small overshoot. This overshoot is nothing but what we call the Gibbs phenomenon, which results in the fact that the Fourier series cannot converge uniformly in the vicinity of a point of discontinuity.

Fig. 11.7. Gibbs phenomena for the Fourier series of a step function. The partial sums of one, five, and fifty terms of the right-hand side of (11.55) are given


11.3.6 Overshoot at a Discontinuous Point 

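Owing to the Gibbs phenomenon, a Fourier series representation is highly unreliable in the vicinity of a discontinuity. We now consider the resulting degree of error when we represent a function $f(x)$ having a discontinuity by a Fourier series.

The maximum overshoot can be evaluated analytically through the following procedure. Let us consider a finite sum of the Fourier series in the complex form

$$S_N(x) = \sum_{n=-N}^{N} c_n e^{inkx},$$

which yields

$$S_N(x) = \frac{1}{\lambda}\int_{-x}^{\lambda-x} f(t+x)K_N(t)\,dt, \qquad K_N(t) = \frac{\sin\left\{\left(N+\frac{1}{2}\right)kt\right\}}{\sin\left(\frac{1}{2}kt\right)}. \tag{11.56}$$

We consider the behavior of $S_N(x)$ in the vicinity of a discontinuity at $x = x_0$. We denote the jump of $f(x)$ at this discontinuity by $\Delta f$ and the jump of its finite Fourier sum by $\Delta S_N$:

$$\Delta f = f(x_0+\varepsilon) - f(x_0-\varepsilon), \qquad \Delta S_N = S_N(x_0+\varepsilon) - S_N(x_0-\varepsilon),$$

where $\varepsilon$ is infinitesimal. We then have

$$\Delta S_N = \frac{1}{\lambda}\int_{-x_0-\varepsilon}^{\lambda-x_0-\varepsilon} f(t+x_0+\varepsilon)K_N(t)\,dt - \frac{1}{\lambda}\int_{-x_0+\varepsilon}^{\lambda-x_0+\varepsilon} f(t+x_0-\varepsilon)K_N(t)\,dt.$$

Owing to the periodicity of the integrand $f(t+x)K_N(t)$, we replace the range of integration as follows:

$$\Delta S_N = \frac{1}{\lambda}\int_{-\varepsilon}^{\lambda-\varepsilon} f(t+x_0+\varepsilon)K_N(t)\,dt - \frac{1}{\lambda}\int_{\varepsilon}^{\lambda+\varepsilon} f(t+x_0-\varepsilon)K_N(t)\,dt.$$

Hence, we have

$$\begin{aligned}
\Delta S_N &= \frac{1}{\lambda}\left(\int_{-\varepsilon}^{\varepsilon}+\int_{\varepsilon}^{\lambda-\varepsilon}\right) f(t+x_0+\varepsilon)K_N(t)\,dt - \frac{1}{\lambda}\left(\int_{\varepsilon}^{\lambda-\varepsilon}+\int_{\lambda-\varepsilon}^{\lambda+\varepsilon}\right) f(t+x_0-\varepsilon)K_N(t)\,dt\\
&= \frac{1}{\lambda}\int_{\varepsilon}^{\lambda-\varepsilon}\left[f(t+x_0+\varepsilon)-f(t+x_0-\varepsilon)\right]K_N(t)\,dt + \frac{1}{\lambda}\int_{-\varepsilon}^{\varepsilon}\left[f(t+x_0+\varepsilon)-f(t+x_0-\varepsilon)\right]K_N(t)\,dt,
\end{aligned} \tag{11.57}$$

where the periodicity of the integrand has been used once more to shift the interval $[\lambda-\varepsilon, \lambda+\varepsilon]$ to $[-\varepsilon, \varepsilon]$.

The integrand of (11.57) is negligible for all values of $t$ except near $t = 0$. Close to $t = 0$, the integrand has a somewhat large value because of (i) the jump of $f(t+x_0)$ at $t = 0$ and (ii) the significant contribution of $K_N(t)$ in the vicinity of $t = 0$. Hence, we can confine the integration to the small interval $(-\delta, +\delta)$ for which the difference in the square brackets in (11.57) is simply $\Delta f$. It now follows that

$$\Delta S_N \simeq \frac{\Delta f}{\lambda}\int_{-\delta}^{\delta}\frac{\sin\left\{\left(N+\frac{1}{2}\right)kt\right\}}{\sin\left(\frac{1}{2}kt\right)}\,dt \simeq \frac{4\Delta f}{\lambda}\int_0^{\delta}\frac{\sin\left\{\left(N+\frac{1}{2}\right)kt\right\}}{kt}\,dt, \tag{11.58}$$

where the sine in the denominator was approximated by its argument because of the smallness of $t$.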

The value of $\Delta S_N$ depends crucially on the interval $\delta$, since the integrand in (11.58) rapidly alternates its sign as $t$ increases. The reader may find the plot of the integrand in Fig. 11.8, where it is shown clearly that the major contribution to the integral comes from the interval $[0, \lambda/(2N+1)]$, where $\lambda/(2N+1)$ is the first zero of the integrand. Hence, if the upper limit is larger than $\lambda/(2N+1)$, the result of the integral will clearly decrease, because in each subsequent period of the oscillation, the area below the horizontal axis is larger than that above. Therefore, if we are interested in the maximum overshoot of the finite sum $\Delta S_N$, we must set the upper limit equal to $\lambda/(2N+1)$.

Fig. 11.8. The integrand of (11.58)

It follows that the maximum overshoot is

$$\begin{aligned}
(\Delta S_N)_{\max} &\simeq \frac{4\Delta f}{\lambda}\int_0^{\lambda/(2N+1)}\frac{\sin\left\{\left(N+\frac{1}{2}\right)kt\right\}}{kt}\,dt
= \frac{4\Delta f}{\lambda}\,\frac{1}{\left(N+\frac{1}{2}\right)k}\int_0^{\pi}\frac{\sin x}{x/\left(N+\frac{1}{2}\right)}\,dx\\
&= \frac{2\Delta f}{\pi}\int_0^{\pi}\frac{\sin x}{x}\,dx \simeq 1.179\,\Delta f.
\end{aligned}$$

We thus conclude that the finite (large-$N$) sum approximation of the discontinuous function overshoots the function itself at a discontinuity by about 18% in this case. This means that the Fourier series tends to overshoot the positive corner by some 18% and to undershoot the negative corner by the same amount. The inclusion of more terms (increasing $N$) does nothing to remove this overshoot but merely moves it closer to the point of discontinuity.
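The factor 1.179 can be reproduced numerically. The following is a rough sketch under stated assumptions, not the authors' own computation: it takes $k = 1$ (period $2\pi$), evaluates $(2/\pi)\int_0^{\pi}(\sin x/x)\,dx$ with composite Simpson's rule, and compares it with the first maximum of the $M$-term partial sum of the square-wave series (11.55), which is located at $x = \pi/(2M)$. The function names are hypothetical.

```python
import math

def gibbs_constant(steps=2000):
    """(2/pi) * integral_0^pi sin(x)/x dx by composite Simpson's rule.
    'steps' must be even."""
    def sinc(x):
        return 1.0 if x == 0.0 else math.sin(x) / x
    h = math.pi / steps
    total = sinc(0.0) + sinc(math.pi)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * sinc(j * h)
    return (2.0 / math.pi) * total * h / 3.0

def square_wave_peak(M):
    """First maximum of the M-term partial sum of (11.55) with k = 1.
    The maximum sits at x = pi/(2M), just to the right of the jump at x = 0."""
    x = math.pi / (2.0 * M)
    return (4.0 / math.pi) * sum(
        math.sin((2 * n - 1) * x) / (2 * n - 1) for n in range(1, M + 1)
    )

if __name__ == "__main__":
    print(gibbs_constant())
    for M in (10, 100, 1000):
        print(M, square_wave_peak(M))
```

For the unit square wave the plateau height is 1 and the jump is $\Delta f = 2$, so a peak value near 1.179 corresponds to $\Delta S_N/\Delta f \simeq 1.179$. Increasing `M` moves the peak toward the jump without lowering it.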


Exercises 

1. Let $f(x)$ be absolutely integrable and form the Fourier series of $f(x)$ in the interval $(-\pi, \pi)$. Show that the convergence of its Fourier series at a specified point $x$ within the interval depends only on the behavior of $f$ in the immediate vicinity of this point. (This result is referred to as the localization theorem.)

Solution: We use the integral formula for the partial sums

$$S_n(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+u)\,\frac{\sin mu}{2\sin(u/2)}\,du = I_1 + I_2 + \frac{1}{\pi}\int_{-\delta}^{\delta} f(x+u)\,\frac{\sin mu}{2\sin(u/2)}\,du,$$

where we have set $m = n + (1/2)$. Here $\delta$ is an arbitrarily small positive number, and $I_1$, $I_2$ are the integrals over the intervals $[\delta, \pi]$ and $[-\pi, -\delta]$, respectively. On these intervals, the function $1/[2\sin(u/2)]$ is continuous (since $|u| \ge \delta$) and, therefore, the function

$$\phi(u) = \frac{f(x+u)}{2\sin(u/2)}$$

is absolutely integrable. It then follows from the Riemann-Lebesgue theorem that the integral

$$I_1 = \frac{1}{\pi}\int_{\delta}^{\pi}\phi(u)\sin mu\,du$$

approaches zero as $m \to \infty$. The same is true of $I_2$. Thus, whether or not the partial sums of the Fourier series have a limit at the point $x$ depends on the behavior of the integral

$$\frac{1}{\pi}\int_{-\delta}^{\delta} f(x+u)\,\frac{\sin mu}{2\sin(u/2)}\,du$$

as $m \to \infty$, which involves only the values of the function $f(x)$ in the neighborhood $[x-\delta, x+\delta]$ of the point $x$. This completes the proof.


2. Let $f(x) = -\log|2\sin(x/2)|$, which is even and becomes infinite at $x = 2k\pi$ ($k = 0, \pm 1, \pm 2, \cdots$).

(i) Show that $f(x)$ is integrable.

(ii) Calculate the Fourier series of $f(x)$.

(iii) Derive the identity: $\log 2 = 1 - (1/2) + (1/3) - (1/4) + \cdots$.

Solution:

(i) The given $f(x)$ equals zero at $x = \pi/3$ and is $2\pi$-periodic. Hence, to prove the integrability of $f(x)$, it suffices to show that it is integrable on the interval $[0, \pi/3]$. Integrating by parts, we have

$$\int_{\varepsilon}^{\pi/3}\log\left|2\sin\frac{x}{2}\right|dx = -\varepsilon\log\left(2\sin\frac{\varepsilon}{2}\right) - \int_{\varepsilon}^{\pi/3}\frac{x\cos(x/2)}{2\sin(x/2)}\,dx,$$

where we have dropped the absolute value sign, since $2\sin(x/2) > 0$ for $0 < x < \pi/3$. As $\varepsilon \to 0$, the quantity $\varepsilon\log[2\sin(\varepsilon/2)]$ approaches zero, which is verified by using l'Hôpital's rule (see Sect. 1.4.1), whereas the last integral converges since the integrand is bounded. (Recall that $\lim_{x\to 0} x/[2\sin(x/2)] = 1$.) Thus, $-\int_0^{\pi/3}\log|2\sin(x/2)|\,dx$ exists, i.e., $f(x)$ is integrable on the interval $[0, \pi/3]$.

(ii) Since $f(x)$ is even, we have $b_n = 0$ ($n = 1, 2, \cdots$) and

$$a_n = -\frac{2}{\pi}\int_0^{\pi}\log\left(2\sin\frac{x}{2}\right)\cos nx\,dx \quad (n = 0, 1, 2, \cdots).$$

For $n \neq 0$, integrating by parts and then applying l'Hôpital's rule, we get

$$a_n = \frac{1}{n\pi}\int_0^{\pi}\frac{\sin nx\,\cos(x/2)}{\sin(x/2)}\,dx \quad (n = 1, 2, \cdots),$$

and then use the identity $2\sin nx\cos(x/2) = \sin[n+(1/2)]x + \sin[n-(1/2)]x$ to obtain

$$a_n = \frac{1}{n\pi}\int_0^{\pi}\frac{\sin[n+(1/2)]x}{2\sin(x/2)}\,dx + \frac{1}{n\pi}\int_0^{\pi}\frac{\sin[n-(1/2)]x}{2\sin(x/2)}\,dx = \frac{1}{n} \quad (n = 1, 2, \cdots).$$

For $n = 0$, we have

$$a_0 = -\frac{2}{\pi}\int_0^{\pi}\log\left(2\sin\frac{x}{2}\right)dx = -\frac{2}{\pi}\int_0^{\pi}\left(\log 2 + \log\sin\frac{x}{2}\right)dx = -\frac{2}{\pi}\left[\pi\log 2 + \int_0^{\pi}\log\left(\sin\frac{x}{2}\right)dx\right].$$

The last integral, denoted by $I$, reads

$$\begin{aligned}
I &= 2\int_0^{\pi/2}\log(\sin t)\,dt = 2\int_0^{\pi/2}\log\left(2\sin\frac{t}{2}\cos\frac{t}{2}\right)dt\\
&= \pi\log 2 + 2\int_0^{\pi/2}\log\left(\sin\frac{t}{2}\right)dt + 2\int_0^{\pi/2}\log\left(\cos\frac{t}{2}\right)dt.
\end{aligned}$$

The substitution $t = \pi - u$ gives $\int_0^{\pi/2}\log[\cos(t/2)]\,dt = \int_{\pi/2}^{\pi}\log[\sin(u/2)]\,du$, which implies that $I = \pi\log 2 + 2I$, i.e., $I = -\pi\log 2$. Consequently, $a_0 = 0$.

(iii) Since the function $f(x)$ is obviously differentiable for $x \neq 2k\pi$ ($k = 0, \pm 1, \pm 2, \cdots$), it follows that

$$-\log\left|2\sin\frac{x}{2}\right| = \cos x + \frac{\cos 2x}{2} + \frac{\cos 3x}{3} + \cdots \tag{11.59}$$

for $x \neq 2k\pi$ ($k = 0, \pm 1, \pm 2, \cdots$). Setting $x = \pi$ in (11.59), we obtain the desired result.


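3. Show that

$$\int_{-x}^{\lambda-x}\frac{e^{i\left(N+\frac{1}{2}\right)kt}}{\sin\left(\frac{1}{2}kt\right)}\,dt = i\lambda. \tag{11.60}$$

Solution: Recall an alternative form of $S_N(x)$ given in (11.40):

$$S_N(x) = \frac{1}{\lambda}\int_{-x}^{\lambda-x} f(t+x)\left(\sum_{n=-N}^{N} e^{-inkt}\right)dt. \tag{11.61}$$

Setting $f(t) \equiv 1$ in (11.61) and (11.52) and comparing them, we have

$$0 + \frac{1}{i\lambda}\int_{-x}^{\lambda-x}\frac{e^{i\frac{1}{2}kt}e^{iNkt}}{\sin\left(\frac{1}{2}kt\right)}\,dt = \frac{1}{\lambda}\int_{-x}^{\lambda-x}\left(\sum_{n=-N}^{N} e^{-inkt}\right)dt = \frac{1}{\lambda}\sum_{n=-N}^{N}\int_{-x}^{\lambda-x} e^{-inkt}\,dt = \frac{1}{\lambda}\sum_{n=-N}^{N}\lambda\,\delta_{n,0} = 1.$$

Multiplying both sides by $i\lambda$ gives (11.60).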


11.4 Applications in Physics and Engineering 

11.4.1 Temperature Variation of the Ground 

The most important applications of Fourier series expansions in the physical sciences are in solving partial differential equations that describe a wide variety of physical phenomena. In this section, two typical examples of such applications are presented, while more rigorous discussions on partial differential equations are given in Chap. 17.

First, we consider the temperature variation of the ground exposed to sunlight. The temperature at a depth of $x$ meters at time $t$, denoted by $u(x, t)$, is known to be determined by the diffusion equation

$$\frac{\partial u}{\partial t} = \kappa\,\frac{\partial^2 u}{\partial x^2}. \tag{11.62}$$

Here, the proportionality constant $\kappa$ is called the thermal diffusivity, and its magnitude for the ground is roughly estimated at $\kappa = 3.0 \times 10^{-6}$ m$^2$/s. We will see below that the Fourier series expansion provides a means of solving equation (11.62) and clarifying the physical interpretation of its solution.

Suppose that the temperature of the land surface, $u(x = 0, t)$, changes periodically with a period $T$; the period $T$ may range from a day to a year. It is then reasonable to express $u(x, t)$ by the Fourier series

$$u(x, t) = \sum_{n=-\infty}^{\infty} c_n(x)\,e^{in\omega t}, \qquad \omega = \frac{2\pi}{T}.$$

Substitute this into (11.62) to obtain

$$in\omega\,c_n(x) = \kappa\,\frac{d^2 c_n(x)}{dx^2},$$

which implies

$$c_n(x) \propto \begin{cases}\exp\left[-(1+i)\sqrt{\dfrac{n\omega}{2\kappa}}\,x\right], & n > 0,\\[2ex] \exp\left[-(1-i)\sqrt{\dfrac{|n|\omega}{2\kappa}}\,x\right], & n < 0.\end{cases}$$

Here we have chosen the solutions that behave as $|c_n(x)| \to 0$ in the limit of $x \to \infty$. In order to obtain the zeroth term $c_0(x)$, we note that

$$\frac{d^2 c_0(x)}{dx^2} = 0$$

and thus

$$c_0(x) = A_0 + B_0 x.$$

Owing to the condition that $|c_0(x)|$ remain finite as $x \to \infty$, we see that $B_0 = 0$ and $c_0(x) = A_0 = \text{const}$. As a result, we obtain

$$u(x, t) = A_0 + 2\sum_{n=1}^{\infty} A_n e^{-\alpha_n x}\cos(n\omega t - \alpha_n x + \phi_n), \tag{11.63}$$

where

$$\alpha_n = \sqrt{\frac{n\omega}{2\kappa}},$$

and the constants $A_n$ and $\phi_n$ are determined by the $t$-dependence of the surface temperature $u(x = 0, t)$.

Note the presence of the parameter $\alpha_n$ in the general solution (11.63). It indicates that a wave component with the period $T/n$ has the following features: (i) decay of the wave amplitude by $e^{-\alpha_n x}$ with an increase in $x$, and (ii) a phase shift by $\alpha_n x$ relative to the surface temperature $u(x = 0, t)$.

Let us quantify the actual value of $\alpha_n$. For this, we consider the case of $T = 1$ day (i.e., $60 \times 60 \times 24$ s) and assume monochromatic variation of the surface temperature given by

$$u(0, t) = 15 + 5\cos\left(\frac{2\pi}{T}t\right)\ {}^{\circ}\mathrm{C}.$$

Comparing this with (11.63) with $x = 0$, we get $A_0 = 15$, $A_1 = 5/2$, and $A_n = 0$ for $n \ge 2$. Then, since

$$\alpha_1 = \sqrt{\frac{2 \times 3.14}{2 \times (3.0 \times 10^{-6}) \times (60 \times 60 \times 24)}} \simeq 3.5\ \mathrm{m}^{-1},$$

we have

$$u(x, t) = 15 + 5e^{-3.5x}\cos\left(\frac{2\pi}{T}t - 3.5x\right).$$

A three-dimensional plot of $u(x, t)$ in the $x$-$t$ plane is shown in Fig. 11.9. We observe that at depths of around 1 m, the temperature variation is almost in antiphase to that at the surface ($x = 0$) and the amplitude decreases considerably.
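The numbers quoted above are easy to reproduce. The following is a minimal sketch, assuming the value $\kappa = 3.0 \times 10^{-6}$ m$^2$/s and the surface condition $u(0, t) = 15 + 5\cos(2\pi t/T)$ used in the text; the names `alpha` and `ground_temperature` are hypothetical helpers introduced only for illustration.

```python
import math

KAPPA = 3.0e-6          # m^2/s, value quoted in the text
T_DAY = 60 * 60 * 24    # one day in seconds

def alpha(n, period=T_DAY, kappa=KAPPA):
    """Decay constant alpha_n = sqrt(n*omega/(2*kappa)) in 1/m,
    with omega = 2*pi/period."""
    omega = 2.0 * math.pi / period
    return math.sqrt(n * omega / (2.0 * kappa))

def ground_temperature(x, t, period=T_DAY, kappa=KAPPA):
    """u(x, t) in deg C for the surface condition
    u(0, t) = 15 + 5*cos(2*pi*t/period); only the n = 1 term of (11.63) survives."""
    a1 = alpha(1, period, kappa)
    omega = 2.0 * math.pi / period
    return 15.0 + 5.0 * math.exp(-a1 * x) * math.cos(omega * t - a1 * x)

if __name__ == "__main__":
    print("alpha_1 (daily)  =", alpha(1))
    print("alpha_1 (yearly) =", alpha(1, period=365 * T_DAY))
    for depth in (0.0, 0.5, 1.0):
        print(depth, ground_temperature(depth, 0.0))
```

Since $\alpha_1 \propto 1/\sqrt{T}$, the yearly variation penetrates roughly $\sqrt{365} \approx 19$ times deeper than the daily one.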


11.4.2 String Vibration Under Impact 


The second example is the vibration of an elastic string subject to an impact force in a local region. Consider the case of a piano wire under an impact force applied by a hammer. Suppose that an impulse $I$ is applied at the position $x = a$ of a suspended string with length $\ell$ and mass density $\rho$. The vibrational amplitude of the string, denoted by $u(x, t)$, is governed by the wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}. \tag{11.64}$$

The string is initially assumed to be stationary, i.e.,

$$u(x, t = 0) = 0. \tag{11.65}$$

The initial velocity of the line element at $x$ is denoted by $v(x)$. Then, the law of the conservation of momentum states that

$$\int_0^{\ell}\rho\,v(x)\,dx = I, \tag{11.66}$$

where

$$v(x) = V \cdot \delta(x - a) \tag{11.67}$$

with an appropriate constant $V$. From (11.66) and (11.67), we have $V = I/\rho$. Furthermore, since

$$v(x) = \left.\frac{\partial u}{\partial t}\right|_{t=0},$$

we have

$$\left.\frac{\partial u}{\partial t}\right|_{t=0} = \frac{I}{\rho}\,\delta(x - a). \tag{11.68}$$

Fig. 11.9. Temperature variation $u(x, t)$ of the ground at a depth of $x$ meters on day $t$

Under the two initial conditions (11.65) and (11.68), the general solution of (11.64) is given by

$$u(x, t) = \sum_{n=1}^{\infty} A_n \sin k_n x\,\sin(\omega_n t + \phi_n), \tag{11.69}$$

where

$$k_n = \frac{n\pi}{\ell}, \qquad \omega_n = c\,k_n.$$

The constants $A_n$ and $\phi_n$ in (11.69) are again determined by the initial conditions. First, imposing the condition $u(x, t = 0) = 0$ on (11.69) implies

$$A_n \sin\phi_n = 0 \quad \text{for all } n, \tag{11.70}$$

owing to the linear independence of $\{\sin k_n x\}$. Next, it follows from (11.69) that

$$\left.\frac{\partial u}{\partial t}\right|_{t=0} = \sum_{n=1}^{\infty} A_n\,\omega_n \cos\phi_n \sin k_n x = \frac{I}{\rho}\,\delta(x - a).$$

Multiplying both sides by $\sin k_m x$ and then integrating yields

$$A_m\,\omega_m \cos\phi_m \int_0^{\ell}\sin^2 k_m x\,dx = \frac{I}{\rho}\sin k_m a \quad \text{for all } m. \tag{11.71}$$

From (11.70) and (11.71), we finally obtain

$$\phi_n = 0 \quad \text{and} \quad A_n = \frac{2I}{\rho\,\ell\,\omega_n}\sin k_n a \quad \text{for all } n. \tag{11.72}$$

The second expression in (11.72) implies that a position $x = a$ that satisfies $\sin k_n a = 0$ yields $A_n = 0$; i.e., the $n$th vibration mode is not excited by an impulsive force applied at such a position. In contrast, if we apply an impulsive force at $x = a$ satisfying $|\sin k_n a| = 1$, the corresponding $n$th mode will have a large vibrational amplitude, as is actually the case inside a piano.

Fourier Transformation


Abstract Fourier transformation is an effective tool for confirming the dual nature of a complex-valued function (as well as a real-valued one). Furthermore, the transformation enables us to measure certain correlations of a function with itself or with other functions; thus a Fourier transform can be applied to probability theory, signal analysis, etc. In this chapter we also provide the essence of a discrete Fourier transform (Sect. 12.3), which refers to a Fourier transform applied to a discrete complex-valued series. A discrete Fourier transform is commonly used in the numerical computation of Fourier transforms because of its computational efficiency.


12.1 Fourier Transform 

12.1.1 Derivation of Fourier Transform 

The properties of Fourier series that we have already developed are adequate 
for handling the expansion of any periodic function. Nevertheless, there are 
many problems in physics and engineering that do not involve periodic func- 
tions, so it is important to generalize Fourier series to include nonperiodic 
functions. A nonperiodic function can be considered as a limit of a given 
periodic function whose period becomes infinite. 

Let us write the Fourier series representing a periodic function $f(x)$ in complex form:

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{ikx}, \tag{12.1}$$

with the definition $k = 2n\pi/\lambda$, in which

$$c_n = \frac{1}{\lambda}\int_{-\lambda/2}^{\lambda/2} f(x)e^{-ikx}\,dx.$$

We then introduce the quantity

$$\Delta k = \frac{2\pi}{\lambda}\Delta n. \tag{12.2}$$

From the definition (12.2), the adjacent values of $k$ are obtained by setting $\Delta n = 1$, which corresponds to $(\lambda/2\pi)\Delta k = 1$. Therefore, multiplying each term on the right-hand side of (12.1) by $(\lambda/2\pi)\Delta k$ yields

$$f(x) = \sum_{n=-\infty}^{\infty} c_{\lambda}(k)\,e^{ikx}\,\Delta k, \tag{12.3}$$

where

$$c_{\lambda}(k) = \frac{\lambda}{2\pi}c_n = \frac{1}{2\pi}\int_{-\lambda/2}^{\lambda/2} f(x)e^{-ikx}\,dx.$$

In the limit as $\lambda \to \infty$, the $k$s are distributed continuously instead of discretely, i.e., $\Delta k \to dk$. Thus, the sum in (12.3) becomes exactly the definition of an integral. As a result, we arrive at the conclusion

$$c(k) = \lim_{\lambda\to\infty} c_{\lambda}(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx, \tag{12.4}$$

$$f(x) = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk. \tag{12.5}$$

Further, by defining $F(k) = \sqrt{2\pi}\,c(k)$, equations (12.4) and (12.5) take the symmetrical form given below, known as the Fourier transform or Fourier integral representation of $f(x)$.

Fourier transform:

The Fourier transform of $f(x)$ is defined by

$$F(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx. \tag{12.6}$$

Inverse Fourier transform:

The inverse Fourier transform of $F(k)$ given above is defined by

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(k)e^{ikx}\,dk. \tag{12.7}$$

We often write the expressions (12.6) and (12.7) in simpler form:

$$F(k) = \mathcal{F}[f(x)] \quad \text{and} \quad f(x) = \mathcal{F}^{-1}[F(k)].$$

Observe that $F(k)$ as well as $f(x)$ are, in general, complex-valued functions of the real variables $k$ and $x$, respectively. Yet, if $f(x)$ is real, then

$$F(-k) = F^*(k),$$

which gives two immediate corollaries (proofs are left to the reader):

1. If $f(x)$ is real and even, $F(k)$ is real.

2. If $f(x)$ is real and odd, $F(k)$ is purely imaginary.


12.1.2 Fourier Integral Theorem 


Our derivations of the Fourier transform and its inverse transform, (12.6) and (12.7), have not been rigorous from a mathematical viewpoint. For developing exact derivations and clarifying the conditions for the infinite integrals in (12.6) and (12.7) to converge, the following theorem is of crucial importance:

Fourier integral theorem:

If $f(x)$ is piecewise smooth and absolutely integrable, then

$$\frac{1}{\pi}\int_0^{\infty}\left[\int_{-\infty}^{\infty} f(t)\cos u(x-t)\,dt\right]du = \frac{f(x+0)+f(x-0)}{2}. \tag{12.8}$$

Remark. The theorem is valid for each fixed $x$, so $x$ can be considered a constant insofar as the integrations are concerned.

Before starting the proof of the theorem, we note that (12.8) reduces to the form of (12.6) and (12.7) when $x$ is a point of continuity of $f(x)$. To see this, we make use of the identity

$$\int_0^{\xi}\cos u(x-t)\,du = \frac{1}{2}\int_{-\xi}^{\xi} e^{iu(x-t)}\,du. \tag{12.9}$$

Since (12.8) reads

$$f(x) = \lim_{\xi\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty} f(t)\,dt\int_0^{\xi}\cos u(x-t)\,du, \tag{12.10}$$

we substitute (12.9) into (12.10) to obtain

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{iux}\,du\int_{-\infty}^{\infty} f(t)e^{-itu}\,dt = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} F(u)e^{iux}\,du,$$

where

$$F(u) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)e^{-iut}\,dt.$$

These results are clearly equivalent to the forms of (12.6) and (12.7).

12.1.3 Proof of the Fourier Integral Theorem

The proof of the Fourier integral theorem is based on the following two 
lemmas: 


Lemma 1: If $f(x)$ is piecewise smooth for all $x \in \mathbb{R}$, then

$$\lim_{\omega\to\infty}\int_0^b f(x)\,\frac{\sin\omega x}{x}\,dx = \frac{\pi}{2}f(0+) \quad \text{for } b > 0.$$

Lemma 2: If $f(x, t)$ is a continuous function of $t$ for $a \le t \le b$ and if $\lim_{\ell\to\infty}\int_0^{\ell} f(x, t)\,dx$ exists and converges uniformly to a certain function $g(t)$ in the interval, then $g(t)$ is continuous in the interval and

$$\int_a^b g(t)\,dt = \int_a^b\left[\int_0^{\infty} f(x, t)\,dx\right]dt = \int_0^{\infty}\left[\int_a^b f(x, t)\,dt\right]dx.$$

Note that $f(0+)$ in Lemma 1 denotes the limiting value of $f(x)$ as $x$ tends to zero through positive values. The proof of Lemma 1 is left to Exercise 2. Lemma 2 follows from the fact that uniform convergence allows us to interchange the order of limiting and integration procedures (see Chapter 3 for details).

We are now ready to prove the Fourier integral theorem expressed by (12.8) . 


Proof (of the Fourier integral theorem): Let $f(x)$ be piecewise smooth and absolutely integrable. Consider the integral

$$\int_{-\infty}^{\infty} f(t)\cos u(x-t)\,dt.$$

Since $|\cos u(x-t)| \le 1$, the convergence of this integral is ensured by our hypothesis that $\int_{-\infty}^{\infty}|f(t)|\,dt$ converges, and since this conclusion is independent of $u$ and $x$, the convergence is uniform for all $u$. Therefore, in view of Lemma 2, we can interchange the order of integration in

$$I = \int_0^b\left[\int_{-\infty}^{\infty} f(t)\cos u(x-t)\,dt\right]du$$

to obtain

$$I = \int_{-\infty}^{\infty}\left[\int_0^b f(t)\cos u(x-t)\,du\right]dt = \int_{-\infty}^{\infty} f(t)\,\frac{\sin b(x-t)}{x-t}\,dt.$$

We now decompose this into four integrals:

$$I = \left(\int_{-\infty}^{-M} + \int_{-M}^{x} + \int_{x}^{M} + \int_{M}^{\infty}\right) f(t)\,\frac{\sin b(x-t)}{x-t}\,dt, \tag{12.11}$$

where $M$ is taken to be so large that the first and the last integrals in (12.11) are less in absolute value than some prescribed $\varepsilon > 0$. By changing variables, taking $u = t - x$, we can write the third integral in (12.11) as

$$\int_0^{M-x}\frac{\sin bu}{u}\,f(x+u)\,du.$$

In view of Lemma 1, this tends to $\pi f(x+0)/2$ as $b \to \infty$. Similarly, the second integral tends to $\pi f(x-0)/2$. Therefore, by taking $M$ sufficiently large, we obtain

$$\lim_{b\to\infty}\left|I - \frac{\pi}{2}\left[f(x+0)+f(x-0)\right]\right| < 2\varepsilon,$$

or equivalently,

$$\left|\int_0^{\infty}\left[\int_{-\infty}^{\infty} f(t)\cos u(x-t)\,dt\right]du - \frac{\pi}{2}\left[f(x+0)+f(x-0)\right]\right| < 2\varepsilon.$$

Since $\varepsilon$ is arbitrary, this completes the proof of the theorem.


12.1.4 Inverse Relations of the Half-width 


In practice, we often encounter functions $f(x)$ having a sharp peak at a specific point, say $x = 0$. The width of the peak of such a function is closely related to the width of the peak that is exhibited by the resulting Fourier transform $F(k) = \mathcal{F}[f(x)]$. A typical example of this phenomenon is seen by considering the Fourier transform of a Gaussian function $f(x) = ae^{-bx^2}$ with $a, b > 0$, i.e.,

$$F(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} ae^{-bx^2}e^{-ikx}\,dx = \frac{ae^{-k^2/(4b)}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-b[x+ik/(2b)]^2}\,dx.$$

We substitute $y = x + ik/(2b)$ to evaluate the integral as

$$\int_{-\infty}^{\infty} e^{-b[x+ik/(2b)]^2}\,dx = \int_{-\infty}^{\infty} e^{-by^2}\,dy = \sqrt{\frac{\pi}{b}},$$

and we get

$$F(k) = \frac{a}{\sqrt{2b}}\,e^{-k^2/(4b)},$$

which is also Gaussian. It is noteworthy that the width of $f(x)$, which is proportional to $1/\sqrt{b}$, is in inverse relation to the width of $F(k)$, which is proportional to $\sqrt{b}$. Therefore, increasing the width of $f(x)$ results in a decrease in the width of $F(k)$. In the limit of infinite width (a constant function), we get infinite sharpness (the delta function). In fact, denoting the widths as $\Delta x$ and $\Delta k$, we have $\Delta x\,\Delta k \sim 1$.

Inverse relation of the half-width:

When $f(x)$ consists of a single peak whose width is characterized by $\Delta x$, its Fourier transform $F(k)$ is also a single-peak function with a width $\Delta k$, which yields $\Delta x\,\Delta k \sim 1$.
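The Gaussian transform pair can be verified by direct numerical integration of (12.6). The sketch below is illustrative only: it assumes the symmetric convention with the factor $1/\sqrt{2\pi}$, truncates the integral to $[-8, 8]$, and uses the trapezoidal rule. The function names and default parameters are hypothetical.

```python
import math

def gaussian_ft_numeric(k, a=1.0, b=2.0, half_range=8.0, steps=4000):
    """F(k) of f(x) = a*exp(-b*x^2) via the trapezoidal rule applied to (12.6).
    The imaginary part vanishes by symmetry, so only the cosine part is summed."""
    h = 2.0 * half_range / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_range + j * h
        w = 0.5 if j in (0, steps) else 1.0  # trapezoid end-point weights
        total += w * a * math.exp(-b * x * x) * math.cos(k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian_ft_exact(k, a=1.0, b=2.0):
    """Closed form a/sqrt(2b) * exp(-k^2/(4b)) derived in the text."""
    return a / math.sqrt(2.0 * b) * math.exp(-k * k / (4.0 * b))

if __name__ == "__main__":
    for k in (0.0, 1.0, 3.0):
        print(k, gaussian_ft_numeric(k), gaussian_ft_exact(k))
    # A narrower f(x) (larger b) gives a broader F(k):
    for b in (2.0, 8.0):
        print(b, gaussian_ft_exact(2.0, b=b) / gaussian_ft_exact(0.0, b=b))
```

The last loop prints the relative height of $F(k)$ at $k = 2$: it is larger for $b = 8$ than for $b = 2$, in line with $\Delta x\,\Delta k \sim 1$.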


For the second example, we evaluate the Fourier transform of a box function defined by

$$f(x) = \begin{cases} b, & \text{if } |x| < a,\\ 0, & \text{if } |x| > a.\end{cases}$$

From the definition, we have

$$F(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)e^{-ikx}\,dx = \frac{b}{\sqrt{2\pi}}\int_{-a}^{a} e^{-ikx}\,dx = \frac{2ab}{\sqrt{2\pi}}\,\frac{\sin ka}{ka}.$$

Observe again that the width of $f(x)$, $\Delta x = 2a$, is in inverse relation to the width of $F(k)$, which is roughly the distance between its first two roots, $k_+$ and $k_-$, on either side of $k = 0$: $\Delta k = k_+ - k_- = 2\pi/a$. In addition, if $a \to \infty$, the function $f(x)$ becomes a constant function over the entire real line, and we get

$$F(k) = \frac{2b}{\sqrt{2\pi}}\lim_{a\to\infty}\frac{\sin ka}{k} = \frac{2b}{\sqrt{2\pi}}\,\pi\,\delta(k).$$

Otherwise, if $b \to \infty$ and $a \to 0$ in such a way that $2ab$ [the area under the graph of $f(x)$] remains fixed at unity, then $f(x)$ approaches the delta function and $F(k)$ becomes

$$F(k) = \lim_{a\to 0}\lim_{b\to\infty}\frac{2ab}{\sqrt{2\pi}}\,\frac{\sin ka}{ka} = \frac{1}{\sqrt{2\pi}}.$$

12.1.5 Parseval Identity for Fourier Transforms 


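If $F(k)$ and $G(k)$ are Fourier transforms of $f(x)$ and $g(x)$, respectively, we
have

$$
\begin{aligned}
\int_{-\infty}^{\infty} f(x) g^*(x)\, dx
&= \int_{-\infty}^{\infty} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(k) e^{ikx}\, dk \right]
   \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G^*(k') e^{-ik'x}\, dk' \right] dx \\
&= \int_{-\infty}^{\infty} dk \int_{-\infty}^{\infty} dk'\, F(k) G^*(k')
   \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(k-k')x}\, dx \right] \\
&= \int_{-\infty}^{\infty} dk\, F(k) \int_{-\infty}^{\infty} dk'\, G^*(k')\, \delta(k'-k) \\
&= \int_{-\infty}^{\infty} F(k) G^*(k)\, dk,
\end{aligned}
\tag{12.12}
$$

or similarly,

$$
\int_{-\infty}^{\infty} f(x) g(x)\, dx = \int_{-\infty}^{\infty} F(k) G(-k)\, dk. \tag{12.13}
$$

In particular, if we set $g(x) = f(x)$ in (12.12), we have

$$
\int_{-\infty}^{\infty} |f(x)|^2\, dx = \int_{-\infty}^{\infty} |F(k)|^2\, dk. \tag{12.14}
$$

Here $|F(k)|^2$ is referred to as the power spectrum of the function $f(x)$.
Equation (12.14), or the more general (12.13), is known as the Parseval
identity for Fourier integrals. No factor of $1/(2\pi)$ appears in these identities
because the prefactor $1/\sqrt{2\pi}$ is shared symmetrically between the transform
and its inverse.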

Remark. A sufficient condition for interchanging the order of integration in
(12.12) is the absolute convergence of the integrals $\int_{-\infty}^{\infty} F(k) e^{ikx}\, dk$ and
$\int_{-\infty}^{\infty} G(k') e^{ik'x}\, dk'$.
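As a quick consistency check of the Parseval identity (a worked example added here, not taken from the text), consider the Gaussian of Sect. 12.1.4, $f(x) = a e^{-bx^2}$, whose transform in the symmetric $1/\sqrt{2\pi}$ convention is $F(k) = (a/\sqrt{2b})\, e^{-k^2/(4b)}$. Both sides can be evaluated with the standard integral $\int_{-\infty}^{\infty} e^{-cx^2} dx = \sqrt{\pi/c}$:

```latex
\begin{aligned}
\int_{-\infty}^{\infty} |f(x)|^2\, dx
  &= a^2 \int_{-\infty}^{\infty} e^{-2bx^2}\, dx
   = a^2 \sqrt{\frac{\pi}{2b}}, \\
\int_{-\infty}^{\infty} |F(k)|^2\, dk
  &= \frac{a^2}{2b} \int_{-\infty}^{\infty} e^{-k^2/(2b)}\, dk
   = \frac{a^2}{2b} \sqrt{2\pi b}
   = a^2 \sqrt{\frac{\pi}{2b}}.
\end{aligned}
```

The two results agree without any additional numerical factor, as expected when the transform and its inverse carry the same prefactor.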

Parseval's identity is very useful for understanding the physical interpretation
of the transform function $F(k)$ when the physical significance of $f(x)$ is known,
as illustrated in the following example.

Example. The displacement of a damped harmonic oscillator as a function
of time is given by

$$
f(t) = \begin{cases} 0 & \text{for } t < 0, \\ e^{-t/\tau} \sin \omega_0 t & \text{for } t > 0. \end{cases}
$$

The Fourier transform of this function is given by

$$
\begin{aligned}
F(\omega)
&= \frac{1}{\sqrt{2\pi}} \left[ \int_{-\infty}^{0} 0 \times e^{-i\omega t}\, dt
   + \int_{0}^{\infty} e^{-t/\tau} \sin \omega_0 t\; e^{-i\omega t}\, dt \right] \\
&= 0 + \frac{1}{\sqrt{2\pi}}\, \frac{1}{2i} \int_{0}^{\infty}
   \left[ e^{-i(\omega - \omega_0)t - t/\tau} - e^{-i(\omega + \omega_0)t - t/\tau} \right] dt \\
&= \frac{1}{\sqrt{2\pi}}\, \frac{1}{2}
   \left[ \frac{1}{\omega + \omega_0 - i/\tau} - \frac{1}{\omega - \omega_0 - i/\tau} \right].
\end{aligned}
$$

The physical interpretation of $|F(\omega)|^2$ is the energy content per unit frequency
interval (i.e., the energy spectrum), while $|f(t)|^2$ is proportional to the sum of
the kinetic and potential energies of the oscillator. Hence, Parseval's identity,
expressed by

$$
\int_{-\infty}^{\infty} |f(t)|^2\, dt = \int_{-\infty}^{\infty} |F(\omega)|^2\, d\omega,
$$

shows the equivalence of these two alternative specifications for the total
energy to within a constant.

12.1.6 Fourier Transforms in Higher Dimensions

The concept of the Fourier transform can be extended naturally to more than
one dimension. For example, in three dimensions we can define the Fourier
transform of $f(x, y, z)$ as

$$
F(k_x, k_y, k_z) = \frac{1}{(2\pi)^{3/2}} \iiint f(x, y, z)\,
  e^{-ik_x x} e^{-ik_y y} e^{-ik_z z}\, dx\, dy\, dz \tag{12.15}
$$

and its inverse as

$$
f(x, y, z) = \frac{1}{(2\pi)^{3/2}} \iiint F(k_x, k_y, k_z)\,
  e^{ik_x x} e^{ik_y y} e^{ik_z z}\, dk_x\, dk_y\, dk_z. \tag{12.16}
$$

Denoting the vector with components $k_x, k_y, k_z$ by $\mathbf{k}$ and that with
components $x, y, z$ by $\mathbf{r}$, we can write the Fourier transform pair (12.15), (12.16) as
follows:


♠ Fourier transforms in three dimensions:

$$
F(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int f(\mathbf{r})\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{r},
\qquad
f(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}} \int F(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{k}.
$$


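It is instructive to evaluate the Fourier transform of a function $f(\mathbf{r})$ under the
condition that the system possesses spherical symmetry, i.e., $f(\mathbf{r}) = f(r)$. We
employ spherical coordinates in which the vector $\mathbf{k}$ of the Fourier transform
lies along the polar axis ($\theta = 0$). We then have

$$
d\mathbf{r} = r^2 \sin\theta\, dr\, d\theta\, d\phi \quad \text{and} \quad \mathbf{k}\cdot\mathbf{r} = kr\cos\theta,
$$

where $k = |\mathbf{k}|$. The Fourier transform is then given by

$$
\begin{aligned}
F(\mathbf{k}) &= \frac{1}{(2\pi)^{3/2}} \int f(r)\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, d\mathbf{r} \\
&= \frac{1}{(2\pi)^{1/2}} \int_{0}^{\infty} dr\, r^2 f(r) \int_{0}^{\pi} d\theta\, \sin\theta\, e^{-ikr\cos\theta},
\end{aligned}
$$

where the integration over $\phi$ has contributed a factor of $2\pi$.
The integral over $\theta$ may be straightforwardly evaluated by noting that

$$
\frac{d}{d\theta}\, e^{-ikr\cos\theta} = ikr \sin\theta\, e^{-ikr\cos\theta}.
$$

Therefore,

$$
F(\mathbf{k}) = \frac{1}{(2\pi)^{1/2}} \int_{0}^{\infty} r^2 f(r)
  \left[ \frac{e^{-ikr\cos\theta}}{ikr} \right]_{\theta=0}^{\theta=\pi} dr
  = \frac{2}{(2\pi)^{1/2}} \int_{0}^{\infty} r^2 f(r)\, \frac{\sin kr}{kr}\, dr.
$$


Remark. A similar result may be obtained for two-dimensional Fourier
transforms in which $f(\mathbf{r}) = f(\rho)$, i.e., $f(\mathbf{r})$ is independent of the azimuthal
angle $\phi$. In this case, we find

$$
F(\mathbf{k}) = \int_{0}^{\infty} \rho f(\rho) J_0(k\rho)\, d\rho,
$$

where $J_0(x)$ is the zeroth-order Bessel function.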


Exercises 

1. Show that if $f(x)$ is piecewise continuous over $(a, b)$, then

$$
\lim_{\xi \to \infty} \int_a^b f(x) \sin \xi x\, dx = 0.
$$

Solution: If $f$ has a continuous derivative, this is easily proved;
we integrate by parts to obtain

$$
\int_a^b f(x) \cos \xi x\, dx = \left[ f(x)\, \frac{\sin \xi x}{\xi} \right]_a^b
  - \frac{1}{\xi} \int_a^b f'(x) \sin \xi x\, dx,
$$

which tends to zero as $\xi \to \infty$ since the integral on the right-hand
side is bounded. If $f$ is not differentiable, let $p$ be a continuously
differentiable function such that $\int_a^b |f(x) - p(x)|\, dx < \varepsilon$. Then

$$
\left| \int_a^b [f(x) - p(x)] \cos \xi x\, dx \right|
  \le \int_a^b |f(x) - p(x)|\, |\cos \xi x|\, dx
  \le \int_a^b |f(x) - p(x)|\, dx < \varepsilon
$$

independently of $\xi$, and as the preceding discussion gave us
$\int_a^b p(x) \cos \xi x\, dx \to 0$, it follows that $\int_a^b f(x) \cos \xi x\, dx \to 0$ as well.
The proof that $\int_a^b f(x) \sin \xi x\, dx \to 0$ is similar. ♣

2. Show that

$$
\int_0^{\infty} \frac{\sin x}{x}\, dx = \frac{\pi}{2}.
$$

Solution: If we substitute $\lambda = 2\pi$, $x = \pi$ into (11.60) and note
that the integrand is an even function, it follows that

$$
\int_0^{\pi} \frac{\sin\left( \frac{2n+1}{2} u \right)}{\sin(u/2)}\, du = \pi. \tag{12.17}
$$

Applying the result of Exercise 1 noted above to the function
$[(2/u) - 1/\sin(u/2)]$ (which is bounded in $0 < u < \pi$), we have

$$
\lim_{n \to \infty} \int_0^{\pi} \sin\left( \frac{2n+1}{2} u \right)
  \left[ \frac{2}{u} - \frac{1}{\sin(u/2)} \right] du = 0. \tag{12.18}
$$

Summing (12.17) and (12.18), we obtain

$$
\lim_{n \to \infty} \int_0^{\pi} \frac{2 \sin\left( \frac{2n+1}{2} u \right)}{u}\, du = \pi.
$$

Changing variables and letting $t = (2n + 1)u/2$, we get

$$
\lim_{n \to \infty} \int_0^{(2n+1)\pi/2} \frac{\sin t}{t}\, dt = \frac{\pi}{2}.
$$

We already know that $\int_0^M (\sin t)/t\, dt$ tends to a limit as $M \to \infty$,
which completes our proof. ♣


3. Show that

$$
\lim_{\lambda \to \infty} \int_0^b f(x)\, \frac{\sin \lambda x}{x}\, dx = \frac{\pi}{2} f(0+) \quad \text{for } b > 0
$$

whenever $f$ is piecewise smooth.


Solution: Observe that

$$
\begin{aligned}
\int_0^b f(x)\, \frac{\sin \lambda x}{x}\, dx
&= f(0+) \int_0^b \frac{\sin \lambda x}{x}\, dx + \int_0^b \frac{f(x) - f(0+)}{x}\, \sin \lambda x\, dx \\
&= f(0+) \int_0^{\lambda b} \frac{\sin t}{t}\, dt + \int_0^b \frac{f(x) - f(0+)}{x}\, \sin \lambda x\, dx.
\end{aligned}
$$

From the result of Exercise 1, the last integral tends to zero as
$\lambda \to \infty$, since the function $[f(x) - f(0+)]/x$ is piecewise smooth in the interval
$0 < x < b$. It also remains bounded in this interval since, as $x$
tends to zero, $[f(x) - f(0+)]/x$ tends to $f'(0+)$. From Exercise 2,
the other integral tends to the desired value. ♣

12.2 Convolution and Correlations

12.2.1 Convolution Theorem

In the application of the Fourier transform, we often encounter a product
such as $F(k)G(k)$, where the two factors are the Fourier transforms of the
functions $f(x)$ and $g(x)$, respectively. Here, we are interested in finding out
how the inverse Fourier transform of the product, denoted by

$$
\mathcal{F}^{-1}[F(k)G(k)],
$$

is related to the individual inverse transforms

$$
\mathcal{F}^{-1}[F(k)] = f(x) \quad \text{and} \quad \mathcal{F}^{-1}[G(k)] = g(x).
$$

To begin with, we introduce a key concept called convolution and then state
an important theorem that plays a central role in the discussion of the matter.


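♠ Convolution:

The convolution of the functions $f(x)$ and $g(x)$, denoted by $f * g$, is
defined by

$$
(f * g)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(u)\, g(x - u)\, du. \tag{12.19}
$$

The convolution obeys the commutative, associative, and distributive
laws of algebra, i.e., if we have functions $f_1, f_2, f_3$, then

$$
\begin{aligned}
f_1 * f_2 &= f_2 * f_1 && \text{(Commutative)}, \\
f_1 * (f_2 * f_3) &= (f_1 * f_2) * f_3 && \text{(Associative)}, \\
f_1 * (f_2 + f_3) &= (f_1 * f_2) + (f_1 * f_3) && \text{(Distributive)}.
\end{aligned}
\tag{12.20}
$$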

We are now ready to prove the following important theorem regarding the 
product F(k)G(k) of two Fourier transforms. 


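♠ Convolution theorem:

If $F(k)$ and $G(k)$ are Fourier transforms of $f(x)$ and $g(x)$, respectively,
then

$$
F(k)G(k) = \mathcal{F}[f * g]. \tag{12.21}
$$

Proof It follows from the definition of the Fourier transform that

$$
F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x) e^{-ikx}\, dx, \qquad
G(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(x') e^{-ikx'}\, dx',
$$

which yields

$$
F(k)G(k) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x) g(x') e^{-ik(x + x')}\, dx\, dx'. \tag{12.22}
$$

Let $x + x' = u$ in the double integral of (12.22) and transform the independent variables
from $(x, x')$ to $(x, u)$. We thus have

$$
dx\, dx' = \left| \frac{\partial(x, x')}{\partial(x, u)} \right| du\, dx,
$$

where the Jacobian of the transformation is

$$
\frac{\partial(x, x')}{\partial(x, u)} =
\begin{vmatrix}
\dfrac{\partial x}{\partial x} & \dfrac{\partial x}{\partial u} \\[2mm]
\dfrac{\partial x'}{\partial x} & \dfrac{\partial x'}{\partial u}
\end{vmatrix}
=
\begin{vmatrix} 1 & 0 \\ -1 & 1 \end{vmatrix} = 1.
$$

Then (12.22) becomes

$$
\begin{aligned}
F(k)G(k) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x) g(u - x) e^{-iku}\, dx\, du \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-iku}
   \left\{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x) g(u - x)\, dx \right\} du \\
&= \mathcal{F}[f * g]. \quad ♣
\end{aligned}
\tag{12.23}
$$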
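The theorem can be verified numerically for a concrete pair of functions. The sketch below is an illustration under stated assumptions: the two Gaussians, the integration window, and the helper names `fourier` and `convolve` are choices made here, and the convolution carries the $1/\sqrt{2\pi}$ prefactor used in the proof above.

```python
import cmath
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def midpoint_grid(x_max, n):
    """Return (points, spacing) of an n-point midpoint grid on [-x_max, x_max]."""
    dx = 2.0 * x_max / n
    return [-x_max + (j + 0.5) * dx for j in range(n)], dx

def fourier(func, k, x_max=10.0, n=400):
    """F(k) = (1/sqrt(2 pi)) * integral of func(x) exp(-i k x) dx (midpoint rule)."""
    xs, dx = midpoint_grid(x_max, n)
    return sum(func(x) * cmath.exp(-1j * k * x) for x in xs) * dx / SQRT_2PI

def convolve(f, g, x_max=10.0, n=400):
    """Return the function u -> (1/sqrt(2 pi)) * integral of f(x) g(u - x) dx."""
    xs, dx = midpoint_grid(x_max, n)
    def conv(u):
        return sum(f(x) * g(u - x) for x in xs) * dx / SQRT_2PI
    return conv

f = lambda x: math.exp(-x * x)          # illustrative choice
g = lambda x: math.exp(-2.0 * x * x)    # illustrative choice

k = 0.8
product_of_transforms = fourier(f, k) * fourier(g, k)
transform_of_convolution = fourier(convolve(f, g), k)
print(abs(product_of_transforms - transform_of_convolution))
```

For these rapidly decaying functions the two quantities agree to within rounding error; slowly decaying functions would need a wider window and more grid points.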


12.2.2 Cross-Correlation Functions 

There are several important functions related to the convolution, which are 
called correlation functions (see below) and auto-correlation functions 

(see Sect. 12.2.3). 

♠ Cross-correlation function:

The cross-correlation of two functions $f$ and $g$ is defined by

$$
c(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f^*(x)\, g(x + z)\, dx. \tag{12.24}
$$

Despite the apparent similarity between the cross-correlation function (12.24)
and the definition of convolution (12.19), their uses and interpretations are
very different: the cross-correlation provides a quantitative measure of the
similarity of two functions $f$ and $g$ as one is displaced through a distance
$z$ relative to the other.

Remark. Similar to the convolution, the cross-correlation is both associative
and distributive. Unlike the convolution, however, it is not commutative.

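We arrive at an important theorem by considering the Fourier transform of
(12.24):

♠ Wiener-Khinchin theorem:

With the prefactor $1/\sqrt{2\pi}$ included in the definition (12.24), the Fourier
transform of the cross-correlation of $f$ and $g$ is equal to the
product of $F^*(k)$ and $G(k)$, i.e.,

$$
\mathcal{F}[c(z)] = C(k) = F^*(k) G(k). \tag{12.25}
$$

Proof

$$
\begin{aligned}
\mathcal{F}[c(z)] = C(k)
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dz\, e^{-ikz}
   \left\{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f^*(x)\, g(z + x)\, dx \right\} \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f^*(x)
   \left\{ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(z + x)\, e^{-ikz}\, dz \right\}.
\end{aligned}
$$

Making the substitution $u = z + x$ in the second integral, we obtain

$$
\begin{aligned}
C(k) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f^*(x)\,
        \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(u)\, e^{-ik(u - x)}\, du \\
&= \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f^*(x)\, e^{ikx}\, dx \right]
   \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(u)\, e^{-iku}\, du \right] \\
&= F^*(k) G(k). \quad ♣
\end{aligned}
\tag{12.26}
$$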

It readily follows from the definition (12.24) and the theorem (12.25) that

$$
c(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} C(k)\, e^{ikz}\, dk
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F^*(k) G(k)\, e^{ikz}\, dk. \tag{12.27}
$$

Then, setting $z = 0$ gives us the multiplication theorem

$$
\int_{-\infty}^{\infty} f^*(x) g(x)\, dx = \int_{-\infty}^{\infty} F^*(k) G(k)\, dk. \tag{12.28}
$$

Further, by letting $g = f$, we arrive at the following identity:

♠ Plancherel identity:

A function $f(x)$ and its Fourier transform $F(k)$ are related to one another
by the identity

$$
\int_{-\infty}^{\infty} |f(x)|^2\, dx = \int_{-\infty}^{\infty} |F(k)|^2\, dk, \tag{12.29}
$$

which is called the Plancherel identity.

Plancherel's identity is sometimes called Parseval's identity, owing to the
analogy with Fourier series.

12.2.3 Autocorrelation Functions 

Particularly when $g(x) = f(x)$, the cross-correlation function $c(z)$ is referred
to specifically as follows:

♠ Autocorrelation function:

The autocorrelation function of $f(x)$ is defined by

$$
a(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f^*(x)\, f(x + z)\, dx.
$$

Using the Wiener-Khinchin theorem (12.25) and denoting the Fourier transform
of $a(z)$ by $A(k)$, we see that

$$
a(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} A(k)\, e^{ikz}\, dk
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F^*(k) F(k)\, e^{ikz}\, dk
     = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} |F(k)|^2\, e^{ikz}\, dk.
$$

This implies that the quantity $|F(k)|^2$, called the power spectrum of $f(x)$,
is the Fourier transform of the autocorrelation function as formally stated
below.


♠ Power spectrum:

Given $f(x)$, we have

$$
|F(k)|^2 = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} a(z)\, e^{-ikz}\, dz,
$$

where $F(k)$ and $a(z)$ are, respectively, the Fourier transform and the
autocorrelation function of $f(x)$.

This result is frequently used in practical applications of Fourier transforms.


12.3 Discrete Fourier Transform 

12.3.1 Definitions 

The present section includes several topics associated with the numerical
computation of Fourier transforms. Generally, in computational work, we do not
treat a continuous function $f(t)$, but rather $f(t_n)$ given by a discrete set of
$t_n$'s. (For now, we assume that a physical process of interest is described in
the time domain.) In most common situations, the value of $f(t)$ is recorded
at evenly spaced intervals. In this context, we have to estimate the Fourier
transform of a function from a finite number of its sampled points.

Suppose that we have a set of measurements performed at equal time
intervals of $\Delta$. Then the sequence of sampled values is given by

$$
f_k = f(t_k), \quad t_k = k\Delta \quad (k = 0, 1, 2, \cdots, N - 1). \tag{12.30}
$$

For simplicity, we assume that $N$ is even. With $N$ numbers of input, we can
produce at most $N$ independent numbers of output. So, instead of trying to
estimate the Fourier transform $F(\omega)$ in the whole range of frequency $\omega$, we
seek estimates only at the discrete values $\omega = \omega_n$ with $n = 0, 1, \cdots, N - 1$.
By analogy with the Fourier transform for a continuous function $f(t)$, we may
define the Fourier transform for a discrete set of $f_k = f(t_k)$ ($k = 0, 1, \cdots, N - 1$)
as below.


♠ Discrete Fourier transform:

The discrete Fourier transform for a discrete set of $f_k$ given by (12.30)
is defined by

$$
F_n = F(\omega_n) = \frac{1}{N} \sum_{k=0}^{N-1} f(t_k)\, e^{-i\omega_n t_k}
    = \frac{1}{N} \sum_{k=0}^{N-1} f_k\, e^{-2\pi i k n/N}, \tag{12.31}
$$

with the definition

$$
\omega_n = \frac{2\pi n}{N\Delta} \quad (n = 0, 1, \cdots, N - 1). \tag{12.32}
$$


Note that $F_n$ is associated with the frequency $\omega_n$. Formally, $n$ in (12.31) can be
any integer from $-\infty$ to $\infty$, whereas $k$ in (12.31) runs
from 0 to $N - 1$. Restricting $n$ to the same range loses nothing, because $F_n$ is periodic
in $n$ with a period of $N$ terms. In fact, for any integer $n$ such that $0 \le n \le N - 1$,
we have

$$
F_n = F_{n \pm N} = F_{n \pm 2N} = \cdots
$$

as readily follows from (12.31).
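The definition translates directly into a double loop. The following is a minimal sketch, assuming the $1/N$ prefactor sits on the forward transform as in (12.31); the function names and the test signal are illustrative choices.

```python
import cmath

def dft_component(samples, n):
    """Return F_n = (1/N) * sum_k f_k * exp(-2*pi*i*k*n/N) for any integer n."""
    n_points = len(samples)
    total = 0j
    for k, f_k in enumerate(samples):
        total += f_k * cmath.exp(-2j * cmath.pi * k * n / n_points)
    return total / n_points

def dft(samples):
    """Return the list [F_0, ..., F_{N-1}]; each entry costs N multiplications."""
    return [dft_component(samples, n) for n in range(len(samples))]

# Samples of sin(2*pi*2*k/N) with N = 8, i.e. two full cycles in the record.
samples = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
spectrum = dft(samples)
for n, value in enumerate(spectrum):
    print(n, round(value.real, 12), round(value.imag, 12))

# Periodicity in n: F_n and F_{n+N} coincide.
print(abs(dft_component(samples, 2) - dft_component(samples, 2 + len(samples))))
```

Only the components $n = 2$ and $n = 6$ are nonzero for this signal; the latter corresponds to $n = -2$ by the periodicity just noted.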


12.3.2 Inverse Transform 

Given the discrete transform F n , we can reproduce the time series fk with the 
aid of the inverse relationship: 


♠ Inverse of discrete Fourier transform:

The discrete Fourier transform of a set $\{f_k\}$ satisfies the relation

$$
f_k = \sum_{n=0}^{N-1} F_n\, e^{2\pi i k n/N}. \tag{12.33}
$$


Proof For the proof, it suffices to observe that

$$
\sum_{n=0}^{N-1} e^{-2\pi i n (k - k')/N} =
\begin{cases} N & (k = k'), \\ 0 & (\text{otherwise}) \end{cases}
\tag{12.34}
$$

(see Exercise 1 in Sect. 12.3). Then, from (12.31) and (12.34), we have

$$
\sum_{n=0}^{N-1} F_n\, e^{2\pi i n k'/N}
 = \frac{1}{N} \sum_{n=0}^{N-1} \sum_{k=0}^{N-1} f_k\, e^{-2\pi i n (k - k')/N}
 = \frac{1}{N} \sum_{k=0}^{N-1} f_k \cdot N \delta_{kk'} = f_{k'}. \quad ♣
$$


Note that the only differences between expressions (12.31) and (12.33) for
$F_n$ and $f_k$, respectively, are (i) changing the sign in the exponential, and (ii)
dividing the answer by $N$. This means that a computational procedure for
calculating discrete Fourier transforms can, with slight modifications, also be
used to calculate the inverse transform. In addition, we see from the inverse
transform that only $N$ values of the frequency $\omega_n$ are needed and that their indices
range from 0 to $N - 1$, just as with the discrete time $t_k$.
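The remark about reusing one procedure for both directions can be made concrete. The sketch below assumes the conventions of (12.31) and (12.33), with the factor $1/N$ on the forward transform; the shared helper `_transform` and the sample data are illustrative.

```python
import cmath

def _transform(values, sign):
    """Shared kernel: sum_j values[j] * exp(sign * 2*pi*i*j*m/N) for m = 0..N-1."""
    n_points = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * j * m / n_points)
            for j, v in enumerate(values))
        for m in range(n_points)
    ]

def dft(samples):
    """Forward transform (12.31): negative exponent, result divided by N."""
    n_points = len(samples)
    return [value / n_points for value in _transform(samples, -1)]

def inverse_dft(spectrum):
    """Inverse transform (12.33): positive exponent, no division."""
    return _transform(spectrum, +1)

data = [1.0, 2.5, -0.5, 4.0, 0.0, -3.0]
recovered = inverse_dft(dft(data))
round_trip_error = max(abs(r - d) for r, d in zip(recovered, data))
print(round_trip_error)
```

The only differences between the two directions are the sign passed to the kernel and the division by $N$, in line with the remark above.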
12.3.3 Nyquist Frequency and Aliasing


In the above discussion, we have taken the view that the index $n$ in (12.31)
varies from 0 to $N - 1$. In this convention, $n$ in $F_n$ and $k$ in $f_k$ vary over exactly
the same range, so the mapping of $N$ numbers into $N$ numbers is manifest.
Alternatively, since the quantity $F_n$ given in (12.31) is periodic in $n$ with
period $N$ (i.e., $F_n = F_{N+n}$), $n$ in $F_n$ is allowed to vary from $-N/2$ to $(N/2) - 1$.
In the latter convention, the discrete Fourier transform and its inverse
transform read, respectively,

$$
F_n = \frac{1}{N} \sum_{k=-N/2}^{N/2-1} f_k\, e^{-2\pi i k n/N}
\quad \text{and} \quad
f_k = \sum_{n=-N/2}^{N/2-1} F_n\, e^{2\pi i k n/N}. \tag{12.35}
$$

Emphasis is placed on the fact that in (12.35), the upper bound of the
summation is not $N/2$ but $(N/2) - 1$. This ensures that the number of distinct $\omega_n$
equals $N$. Indeed, the periodicity of $F_n$ in $n$ with the period $N$ implies that the
discretized frequency $\omega_n = 2\pi n/(N\Delta)$ is also periodic in $n$ with period $N$. Hence,
the two extreme values of $\omega_n$, i.e.,

$$
\omega_{-N/2} = -\frac{\pi}{\Delta} \quad \text{and} \quad \omega_{N/2} = \frac{\pi}{\Delta},
$$

contribute to $F_n$ as given in (12.31) in the same way. These two indistinguishable
frequencies are known as the Nyquist critical frequencies.


♠ Nyquist critical frequency:

A Nyquist critical frequency is defined by

$$
\omega_c = \frac{\pi}{\Delta},
$$

where $\Delta$ is the sampling interval: $t_k = k\Delta$ ($k = 0, 1, \cdots, N - 1$).


The Nyquist critical frequency has the following peculiarity. Suppose that we
sample a sine wave of the Nyquist critical frequency, expressed by

$$
f(t) = \sin(\omega_c t + \theta),
$$

at the sampling interval $\Delta$. Then we have

$$
f_k = f(t_k) = \sin(\omega_c t_k + \theta) = \sin(\omega_c k\Delta + \theta) = \sin(k\pi + \theta)
\quad (k = 0, 1, \cdots, N - 1),
$$

where $\theta$ is determined by the initial condition: $f(0) = \sin\theta$. Then, the
sampling becomes two sample points per cycle: $\sin\theta$ and $-\sin\theta$.

The above arguments further suggest that discretized frequencies $\omega_n$ above
$\omega_c$ (and below $-\omega_c$) are identified with $\omega_{n-N}$ (and $\omega_{n+N}$). This phenomenon,
peculiar to discrete sampling, leads to the following important consequence:

♠ Aliasing:

When a continuous function $f(t)$ is sampled with an interval $\Delta$, all of the
power spectral density lying outside of the range $[-\omega_c, \omega_c)$ with $\omega_c = \pi/\Delta$
is moved into that range, owing to a phenomenon called aliasing.

Through discrete sampling, therefore, any frequency component outside of the
range $[-\omega_c, \omega_c)$ is falsely translated into that range.

Example Suppose that two continuous waves $\exp(i\omega_1 t)$ and $\exp(i\omega_2 t)$ are
sampled with the same interval $\Delta$. Then, if $\omega_2 = \omega_1 \pm 2\omega_c$, we obtain the same
samples, since

$$
\begin{aligned}
\exp(i\omega_2 t_k) &= \exp(i\omega_1 t_k) \times \exp(\pm 2i\omega_c t_k) \\
&= \exp(i\omega_1 t_k) \times \exp(\pm 2k\pi i) = \exp(i\omega_1 t_k),
\end{aligned}
$$

where $t_k = k\Delta$ ($k = 0, 1, \cdots, N - 1$). Hence, a sinusoidal wave having a
frequency lying outside the range $[-\omega_c, \omega_c)$ appears the same as the sinusoidal
wave whose frequency is within the range.
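The example can be reproduced in a few lines. This is an illustrative sketch; the sampling interval, the frequency $\omega_1$, and the number of samples are arbitrary choices made here.

```python
import cmath
import math

def sample_wave(omega, delta, n_samples):
    """Sample exp(i*omega*t) at t_k = k*delta for k = 0..n_samples-1."""
    return [cmath.exp(1j * omega * k * delta) for k in range(n_samples)]

delta = 0.1                        # sampling interval (arbitrary)
omega_c = math.pi / delta          # Nyquist critical frequency
omega_1 = 0.3 * omega_c            # inside [-omega_c, omega_c)
omega_2 = omega_1 + 2.0 * omega_c  # outside the range, shifted by 2*omega_c

samples_1 = sample_wave(omega_1, delta, 16)
samples_2 = sample_wave(omega_2, delta, 16)
max_diff = max(abs(u - v) for u, v in zip(samples_1, samples_2))
print(max_diff)
```

The two sample sets agree to rounding error, so nothing in the sampled data can distinguish $\omega_2$ from $\omega_1$.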


Remark. The way to overcome aliasing is to (i) know the natural bandwidth 
limit of the signal - or else enforce a known limit by analog filtering of the 
continuous signal, and then (ii) sample at a rate sufficiently rapid to give at 
least two points per cycle of the highest frequency present. 


12.3.4 Sampling Theorem 

We present below a famous theorem that is useful in certain applications of 
the discrete Fourier transform. 


♠ Sampling theorem:

Suppose that a continuous function $f(t)$ is sampled at an interval $\Delta$ as
$f_k = f(k\Delta)$. If its Fourier transform satisfies the condition that $F(\omega) = 0$
for all $|\omega| > \omega_c = \pi/\Delta$, then we have

$$
f(t) = \Delta \sum_{k=-\infty}^{\infty} f_k\, \frac{\sin[\omega_c (t - k\Delta)]}{\pi (t - k\Delta)}.
$$

This theorem states that if a signal $f(t)$ in question is bandwidth-limited
(i.e., $F(\omega) = 0$ for $|\omega| > |\omega_0|$) with a certain preassigned frequency $\omega_0$,
then the entire information content of the signal can be recorded by sampling
it at the interval $\Delta = \pi/\omega_0$.
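The reconstruction formula can be tried on a signal that is known to be band-limited. The sketch below is illustrative only: it uses $f(t) = (\sin t/t)^2$, whose spectrum vanishes for $|\omega| > 2$, takes $\omega_c = 3$ so that $\Delta = \pi/3$, and truncates the infinite sum at $|k| \le k_{\max}$, which introduces a small error. All names and parameter values are assumptions made for this example.

```python
import math

def signal(t):
    """Band-limited test signal (sin t / t)^2; its spectrum vanishes for |omega| > 2."""
    if abs(t) < 1e-12:
        return 1.0
    return (math.sin(t) / t) ** 2

def sinc_reconstruct(sample, delta, t, k_max):
    """Truncated sampling series: sum over k = -k_max..k_max of
    f(k*delta) * sin[omega_c (t - k*delta)] / [omega_c (t - k*delta)]."""
    omega_c = math.pi / delta
    total = 0.0
    for k in range(-k_max, k_max + 1):
        shifted = t - k * delta
        if abs(shifted) < 1e-12:
            kernel = 1.0  # limit of sin(x)/x as x -> 0
        else:
            kernel = math.sin(omega_c * shifted) / (omega_c * shifted)
        total += sample(k * delta) * kernel
    return total

delta = math.pi / 3.0  # omega_c = 3, above the band edge at 2
for t in (0.4, 1.7, -2.3):
    print(t, signal(t), sinc_reconstruct(signal, delta, t, 2000))
```

Because $\Delta/\pi = 1/\omega_c$, the kernel used in the code is the same as $\Delta \sin[\omega_c(t - k\Delta)]/[\pi(t - k\Delta)]$ in the theorem.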
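Proof Given a continuous function $f(t)$, we express it by the inverse Fourier
transform as

$$
f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega)\, e^{i\omega t}\, d\omega.
$$

By hypothesis, $F(\omega)$ vanishes for $|\omega| > \omega_c$ so that

$$
f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\omega_c}^{\omega_c} F(\omega)\, e^{i\omega t}\, d\omega,
$$

which yields

$$
f(t_k) = \frac{1}{\sqrt{2\pi}} \int_{-\omega_c}^{\omega_c} F(\omega)\, e^{i\omega t_k}\, d\omega
\quad \text{for } t_k = k\Delta \ (k \in \mathbb{Z}).
$$

Consider the Fourier series expansion of $F(\omega)$ as

$$
F(\omega) = \sum_{k=-\infty}^{\infty} c_k\, e^{-i\omega t_k} \quad \text{for } |\omega| < \omega_c, \tag{12.36}
$$

where the coefficients $c_k$ read

$$
c_k = \frac{1}{2\omega_c} \int_{-\omega_c}^{\omega_c} F(\omega)\, e^{i\omega t_k}\, d\omega
    = \frac{\sqrt{2\pi}}{2\omega_c}\, f(t_k) = \frac{\Delta}{\sqrt{2\pi}}\, f(t_k). \tag{12.37}
$$

From (12.36) and (12.37), we obtain

$$
F(\omega) = \frac{\Delta}{\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} f(t_k)\, e^{-i\omega t_k} \quad \text{for } |\omega| < \omega_c.
$$

Now we define

$$
H(\omega) = \frac{\Delta}{\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} f(t_k)\, e^{-i\omega t_k} \quad \text{for all } \omega.
$$

While the function $H(\omega)$ is a periodic function with period $2\omega_c$, $F(\omega)$ is
identically zero outside the interval $[-\omega_c, \omega_c]$. This being so, we can write

$$
F(\omega) = H(\omega) S(\omega) \quad \text{with} \quad
S(\omega) = \begin{cases} 1 & |\omega| \le \omega_c, \\ 0 & |\omega| > \omega_c. \end{cases}
$$

Thus we have

$$
F(\omega) = \frac{\Delta}{\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} f(t_k)\, e^{-i\omega t_k} S(\omega),
$$

and its inverse transform reads

$$
\begin{aligned}
f(t) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty}
  \left[ \frac{\Delta}{\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} f(t_k)\, e^{-i\omega t_k} S(\omega) \right] e^{i\omega t}\, d\omega \\
&= \sum_{k=-\infty}^{\infty} f(t_k)\, \frac{\Delta}{2\pi} \int_{-\omega_c}^{\omega_c} e^{i\omega (t - t_k)}\, d\omega \\
&= \sum_{k=-\infty}^{\infty} f(t_k)\, \frac{\sin[\omega_c (t - t_k)]}{\omega_c (t - t_k)}.
\end{aligned}
$$

Since $\omega_c = \pi/\Delta$, this is the series stated in the theorem. ♣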


12.3.5 Fast Fourier Transform 

The fast Fourier transform (often abbreviated as FFT) is an algorithm for
calculating discrete Fourier transforms and is widely known as a useful tool
in computational physics. In this subsection, we demonstrate the efficiency of
this computational method.

In a typical discrete Fourier transform, one has a sum of $N$ terms expressed
by

$$
F_n = \sum_{k=0}^{N-1} W^{nk} f_k, \tag{12.38}
$$

where $W$ is a complex number defined by

$$
W = e^{2\pi i/N}.
$$

Notably, the right-hand side of (12.38) can be regarded as a product of the
vector consisting of the elements $\{f_k\}$ with a matrix whose $(n, k)$th element
is the constant $W$ to the power $n \times k$. The matrix multiplication produces
a vector whose components are the $F_n$'s. This operation evidently requires
$N^2$ complex-number multiplications plus a smaller number of operations to
generate the required powers of $W$. Thus, the discrete Fourier transform
appears to be an $O(N^2)$ process.

The efficiency of the fast Fourier transform manifests in the fact that it
enables us to compute the discrete Fourier transform in $O(N \log_2 N)$ operations.
The difference between $O(N^2)$ and $O(N \log_2 N)$ is immense. With $N = 10^8$,
e.g., it is the difference between, roughly, 2 s and 3 months of CPU time on a
gigahertz cycle computer.

The fast Fourier transform is based on the fact that a discrete Fourier
transform of length $N$ can be rewritten as the sum of two discrete Fourier
transforms, each of length $N/2$. This is easily seen from (12.38) as follows:

$$
\begin{aligned}
F_n &= \sum_{k=0}^{N-1} e^{2\pi i k n/N} f_k \\
&= \sum_{k=0}^{N/2-1} e^{2\pi i n (2k)/N} f_{2k} + \sum_{k=0}^{N/2-1} e^{2\pi i n (2k+1)/N} f_{2k+1} \\
&= \sum_{k=0}^{N/2-1} e^{2\pi i n k/(N/2)} f_{2k} + W^n \sum_{k=0}^{N/2-1} e^{2\pi i n k/(N/2)} f_{2k+1} \\
&= F_n^e + W^n F_n^o.
\end{aligned}
\tag{12.39}
$$

Here $W$ is the same complex constant we defined in (12.38). The $F_n^e$ denotes
the $n$th component of the Fourier transform of the sequence $(f_{2k})$ with length
$N/2$ expressed by

$$
(f_{2k}) = (f_0, f_2, f_4, \cdots, f_{N-2}),
$$

which consists of even components of the original $f_k$'s. Similarly, the $F_n^o$ is the
corresponding transform of length $N/2$ formed from odd components. Recall
that $F_n$ is periodic in $n$ with the period $N$. On the other hand, the transforms
$F_n^e$ and $F_n^o$ are periodic in $n$ with period $N/2$. This period-reduction property
is the origin of the efficiency of the fast Fourier transform as demonstrated
below.
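The splitting (12.39) leads naturally to a recursive procedure. The following is a minimal sketch rather than an optimized implementation; it follows the convention of (12.38), i.e. $W = e^{2\pi i/N}$ and no $1/N$ prefactor, and assumes that the input length is a power of two. The function names are illustrative.

```python
import cmath

def fft_recursive(f):
    """Evaluate F_n = sum_k W**(n*k) * f_k with W = exp(2*pi*i/N) by the
    even/odd splitting (12.39). len(f) is assumed to be a power of two."""
    n_points = len(f)
    if n_points == 1:
        return list(f)            # a one-point transform is the sample itself
    even = fft_recursive(f[0::2])  # F^e, period N/2
    odd = fft_recursive(f[1::2])   # F^o, period N/2
    half = n_points // 2
    result = [0j] * n_points
    for n in range(half):
        twiddle = cmath.exp(2j * cmath.pi * n / n_points) * odd[n]  # W^n F^o_n
        result[n] = even[n] + twiddle
        # F^e and F^o repeat with period N/2, and W^(n + N/2) = -W^n.
        result[n + half] = even[n] - twiddle
    return result

def dft_direct(f):
    """Direct O(N^2) evaluation of (12.38), used here only as a reference."""
    n_points = len(f)
    return [
        sum(f_k * cmath.exp(2j * cmath.pi * n * k / n_points)
            for k, f_k in enumerate(f))
        for n in range(n_points)
    ]

data = [1.0, 2.0, -1.0, 0.5, 3.0, -2.5, 0.0, 4.0]
fast = fft_recursive(data)
slow = dft_direct(data)
max_error = max(abs(a - b) for a, b in zip(fast, slow))
print(max_error)
```

Each level of recursion does $O(N)$ work and there are $\log_2 N$ levels, which is where the $O(N \log_2 N)$ count comes from.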

Having decomposed $F_n$ into $F_n^e$ and $F_n^o$, we can apply the same procedure
to $F_n^e$ and $F_n^o$ to produce transforms of the $N/4$ even-numbered and odd-numbered data:

$$
\begin{aligned}
F_n^e &= \sum_{k=0}^{N/4-1} e^{2\pi i n k/(N/4)} f_{4k} + W^{2n} \sum_{k=0}^{N/4-1} e^{2\pi i n k/(N/4)} f_{4k+2} \\
&= F_n^{ee} + W^{2n} F_n^{eo},
\end{aligned}
\tag{12.40}
$$

$$
\begin{aligned}
F_n^o &= \sum_{k=0}^{N/4-1} e^{2\pi i n k/(N/4)} f_{4k+1} + W^{2n} \sum_{k=0}^{N/4-1} e^{2\pi i n k/(N/4)} f_{4k+3} \\
&= F_n^{oe} + W^{2n} F_n^{oo}.
\end{aligned}
\tag{12.41}
$$

The factor $W^{2n} = e^{2\pi i n/(N/2)}$ plays, for a transform of length $N/2$, the role
that $W^n$ plays in (12.39). Here, the $F_n^{eo}$, e.g., is the transform of the sequence $(f_{4k+2})$ given by

$$
(f_{4k+2}) = (f_2, f_6, \cdots, f_{N-2}),
$$

whose length is $N/4$. We can continue the above procedure until we obtain
the transform of a single-point sequence, say,

$$
F_n^{eoeeoeo \cdots oee} = f_k \quad \text{for some } k. \tag{12.42}
$$

This implies that for every pattern of $\log_2 N$ e's and o's, there is a one-point
transform that is just one of the input numbers $f_k$. Therefore, by relating
all the terms $f_k$ ($0 \le k \le N - 1$) to $\log_2 N$ patterns of e's and o's and
then tracking back through the procedures (12.39), (12.40), (12.41), and (12.42) to
reproduce $F_n$, we will successfully obtain the discrete Fourier transform $F_n$
($0 \le n \le N - 1$) of the original data $f_k$ ($0 \le k \le N - 1$).

One may ask how we can figure out which value of
$k$ corresponds to which pattern of e's and o's in (12.42). As we demonstrate
later, this can be achieved by reversing the pattern of e's and o's and setting
e = 0 and o = 1. Then, we have the corresponding value of $k$ in a binary
expression. This idea of bit reversal can be exploited in a very clever way
that makes FFTs practical.
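The rule for translating a pattern of e's and o's into an index $k$ can be written out explicitly. In this illustrative sketch the pattern is given as a string whose first character labels the first (outermost) split; the function names are assumptions made here.

```python
def pattern_to_index(pattern):
    """Map a pattern such as 'eoo' to the index k of the single sample it
    selects: reverse the pattern, read e as 0 and o as 1, interpret as binary."""
    bits = "".join("1" if label == "o" else "0" for label in reversed(pattern))
    return int(bits, 2)

def bit_reverse(n, n_bits):
    """Return the integer whose n_bits-bit binary expansion is that of n reversed."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result

# N = 8: the first split 'e' keeps (f0, f2, f4, f6), then 'o' keeps (f2, f6),
# and the last 'o' keeps f6.
print(pattern_to_index("eoo"))

# Order in which the inputs appear after full subdivision for N = 8.
print([bit_reverse(n, 3) for n in range(8)])
```

For $N = 8$ the second line lists the bit-reversed order of the indices $0, \cdots, 7$.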

12.3.6 Matrix Representation of FFT Algorithm 

To make our discussion more concrete, we now present an actual FFT
procedure to obtain the discrete Fourier transform $F(n)$ ($n = 0, 1, 2, 3$) of the
original vector data $f(k)$ ($k = 0, 1, 2, 3$). By definition, $F(n)$ is given in the
matrix representation as

$$
\begin{pmatrix} F(0) \\ F(1) \\ F(2) \\ F(3) \end{pmatrix}
=
\begin{pmatrix}
W^0 & W^0 & W^0 & W^0 \\
W^0 & W^1 & W^2 & W^3 \\
W^0 & W^2 & W^4 & W^6 \\
W^0 & W^3 & W^6 & W^9
\end{pmatrix}
\begin{pmatrix} f(0) \\ f(1) \\ f(2) \\ f(3) \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & W^1 & W^2 & W^3 \\
1 & W^2 & W^0 & W^2 \\
1 & W^3 & W^2 & W^1
\end{pmatrix}
\begin{pmatrix} f(0) \\ f(1) \\ f(2) \\ f(3) \end{pmatrix},
\tag{12.43}
$$

where we used the fact that

$$
W^4 = (e^{2\pi i/4})^4 = e^{2\pi i} = 1.
$$

More generally, we have

$$
W^{nk} = W^{nk \bmod N},
$$

where the number $nk \bmod N$
is the remainder when the integer $nk$ is divided by $N$. The trick involved in
the FFT algorithm is to decompose the product of the vector and the matrix
appearing in (12.43) into that of a vector and two matrices:

$$
\begin{pmatrix} F(0) \\ F(2) \\ F(1) \\ F(3) \end{pmatrix}
=
\begin{pmatrix}
1 & W^0 & 0 & 0 \\
1 & W^2 & 0 & 0 \\
0 & 0 & 1 & W^1 \\
0 & 0 & 1 & W^3
\end{pmatrix}
\begin{pmatrix}
1 & 0 & W^0 & 0 \\
0 & 1 & 0 & W^0 \\
1 & 0 & W^2 & 0 \\
0 & 1 & 0 & W^2
\end{pmatrix}
\begin{pmatrix} f(0) \\ f(1) \\ f(2) \\ f(3) \end{pmatrix}.
\tag{12.44}
$$

The equivalence between (12.43) and (12.44) is verified in a straightforward
manner. Nevertheless, the reader should pay attention to the fact that in
(12.44), the order of elements in the vector $F(n)$ is altered from that in the
original form (12.43). As we demonstrate later, this altering of the
order of $F(n)$ enables us to compute $F(n)$ efficiently from $f(k)$ with the
help of the bit-reversing process.

The efficiency of FFT can be observed by counting the number of
multiplications (and additions) between matrix elements needed to complete
the matrix operation given in (12.44). First we set

$$
\begin{pmatrix} f_1(0) \\ f_1(1) \\ f_1(2) \\ f_1(3) \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & W^0 & 0 \\
0 & 1 & 0 & W^0 \\
1 & 0 & W^2 & 0 \\
0 & 1 & 0 & W^2
\end{pmatrix}
\begin{pmatrix} f_0(0) \\ f_0(1) \\ f_0(2) \\ f_0(3) \end{pmatrix},
$$

in which $f_0(k) = f(k)$ ($k = 0, 1, 2, 3$). Then $f_1(0)$ is obtained through one
complex-number multiplication and one complex-number addition, i.e.,

$$
f_1(0) = f_0(0) + W^0 f_0(2). \tag{12.45}
$$

We can obtain $f_1(1)$ in the same manner as above. In contrast, to obtain
$f_1(2)$, only one complex-number addition is needed due to the relation $W^2 = -W^0$.
In fact,

$$
f_1(2) = f_0(0) + W^2 f_0(2) = f_0(0) - W^0 f_0(2),
$$

in which the product $W^0 f_0(2)$ was evaluated earlier in the calculation of
(12.45). Likewise, $f_1(3) = f_0(1) + W^2 f_0(3) = f_0(1) - W^0 f_0(3)$ is also computed
by only one addition, since the product $W^0 f_0(3)$ was already evaluated for $f_1(1)$.
As a consequence, the vector $f_1(k)$ ($k = 0, 1, 2, 3$) is
calculated through four additions and two multiplications.

A similar scenario applies to the remaining computation:

$$
\begin{pmatrix} F(0) \\ F(2) \\ F(1) \\ F(3) \end{pmatrix}
=
\begin{pmatrix} f_2(0) \\ f_2(1) \\ f_2(2) \\ f_2(3) \end{pmatrix}
=
\begin{pmatrix}
1 & W^0 & 0 & 0 \\
1 & W^2 & 0 & 0 \\
0 & 0 & 1 & W^1 \\
0 & 0 & 1 & W^3
\end{pmatrix}
\begin{pmatrix} f_1(0) \\ f_1(1) \\ f_1(2) \\ f_1(3) \end{pmatrix}.
$$

Calculation of each of the numbers $f_2(0)$ and $f_2(2)$ requires one addition and
one multiplication, whereas for $f_2(1)$ and $f_2(3)$ only one addition is required
because of the relations $W^2 = -W^0$ and $W^3 = -W^1$. Therefore, the entire
computation to yield $F(n)$ in the above context requires four multiplications
and eight additions. This computational cost is significantly smaller
than that of the direct matrix calculation given in (12.43), where 16
multiplications and 12 additions are needed. More generally, when
considering the transform $F(n)$ of the length $N = 2^\gamma$, the FFT procedure requires
$N\gamma/2$ multiplications and $N\gamma$ additions, whereas the
direct matrix calculation demands $N^2$ multiplications and
$N(N - 1)$ additions. Thus the superiority of the FFT method is considerably
enhanced when $N \gg 1$.
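The factorization (12.44) can be checked by multiplying the matrices out. The following sketch is only a verification aid with illustrative names and test data; it performs the full matrix products and therefore does not exploit the savings counted above.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 4)  # W = e^{2*pi*i/N} with N = 4

def matvec(matrix, vector):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Full matrix of (12.43), using W^(nk) = W^(nk mod 4).
full = [[W ** ((n * k) % 4) for k in range(4)] for n in range(4)]

# The two sparse factors of (12.44); first_stage acts on f first.
first_stage = [
    [1, 0, W ** 0, 0],
    [0, 1, 0, W ** 0],
    [1, 0, W ** 2, 0],
    [0, 1, 0, W ** 2],
]
second_stage = [
    [1, W ** 0, 0, 0],
    [1, W ** 2, 0, 0],
    [0, 0, 1, W ** 1],
    [0, 0, 1, W ** 3],
]

f0 = [1.0, 2.0 - 1.0j, -0.5, 3.0j]   # arbitrary test data
f1 = matvec(first_stage, f0)
f2 = matvec(second_stage, f1)
direct = matvec(full, f0)

order = [0, 2, 1, 3]  # f2 holds F(0), F(2), F(1), F(3)
mismatch = max(abs(f2[i] - direct[order[i]]) for i in range(4))
print(mismatch)
```

The list `order` is the bit-reversed ordering of the indices $0, 1, 2, 3$, which is why the left-hand side of (12.44) lists $F(2)$ before $F(1)$.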
12.3.7 Decomposition Method for FFT

It is still unclear how we can find an appropriate decomposition of general
$N \times N$ matrices as performed in (12.44). To see this, we express the indices
$n$ and $k$ in terms of two-digit binary expressions:

$$
n = 2n_1 + n_0, \quad k = 2k_1 + k_0,
$$

where each of $n_1, n_0, k_1, k_0$ takes the value 0 or 1 [e.g., $n = 2$ corresponds to
$(n_1, n_0) = (1, 0)$]. Then, the discrete Fourier transform reads

$$
F(n_1, n_0) = \sum_{k_0=0}^{1} \sum_{k_1=0}^{1} f_0(k_1, k_0)\, W^{(2n_1 + n_0)(2k_1 + k_0)}.
$$

Now we apply the identity

$$
W^{(2n_1 + n_0)(2k_1 + k_0)} = W^{4n_1 k_1}\, W^{2n_0 k_1}\, W^{(2n_1 + n_0)k_0}
  = W^{2n_0 k_1}\, W^{(2n_1 + n_0)k_0},
$$

which holds because $W^4 = 1$, to obtain

$$
F(n_1, n_0) = \sum_{k_0=0}^{1} \left[ \sum_{k_1=0}^{1} f_0(k_1, k_0)\, W^{2n_0 k_1} \right] W^{(2n_1 + n_0)k_0}. \tag{12.46}
$$

Denoting the sum in the square bracket by $f_1(n_0, k_0)$, we have

$$
f_1(n_0, k_0) = \sum_{k_1=0}^{1} f_0(k_1, k_0)\, W^{2n_0 k_1}, \tag{12.47}
$$

or equivalently,

$$
\begin{aligned}
f_1(0, 0) &= f_0(0, 0) + f_0(1, 0) W^0, \\
f_1(0, 1) &= f_0(0, 1) + f_0(1, 1) W^0, \\
f_1(1, 0) &= f_0(0, 0) + f_0(1, 0) W^2, \\
f_1(1, 1) &= f_0(0, 1) + f_0(1, 1) W^2.
\end{aligned}
$$

This system of equations is expressed in matrix form as

$$
\begin{pmatrix} f_1(0,0) \\ f_1(0,1) \\ f_1(1,0) \\ f_1(1,1) \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & W^0 & 0 \\
0 & 1 & 0 & W^0 \\
1 & 0 & W^2 & 0 \\
0 & 1 & 0 & W^2
\end{pmatrix}
\begin{pmatrix} f_0(0,0) \\ f_0(0,1) \\ f_0(1,0) \\ f_0(1,1) \end{pmatrix}.
$$

Similarly, from (12.46) and (12.47), the outer sum
$f_2(n_0, n_1) = \sum_{k_0=0}^{1} f_1(n_0, k_0)\, W^{(2n_1 + n_0)k_0}$ takes the matrix form

$$
\begin{pmatrix} f_2(0,0) \\ f_2(0,1) \\ f_2(1,0) \\ f_2(1,1) \end{pmatrix}
=
\begin{pmatrix}
1 & W^0 & 0 & 0 \\
1 & W^2 & 0 & 0 \\
0 & 0 & 1 & W^1 \\
0 & 0 & 1 & W^3
\end{pmatrix}
\begin{pmatrix} f_1(0,0) \\ f_1(0,1) \\ f_1(1,0) \\ f_1(1,1) \end{pmatrix}.
$$

Hence, we have

$$
F(n_1, n_0) = f_2(n_0, n_1),
$$

in which the order of $n_0$ and $n_1$ in the parentheses differs on the two sides.
This indicates that the individual numbers $f_2(n_0, n_1)$ are in order not of
$n = 2n_1 + n_0$, but of the numbers obtained by bit-reversing $n$, which is why
the bit-reversing process is required to obtain the discrete Fourier transform
$F(n)$ using FFT. The above discussion also clearly demonstrates the
way to construct the decomposed product of matrices that makes the entire
computation fast.


Exercises 


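1. Show that

$$
\sum_{n=0}^{N-1} e^{-2\pi i n (k - k')/N} =
\begin{cases} N & \text{if } k = k', \\ 0 & \text{otherwise}, \end{cases}
$$

where $k$ and $k'$ are integers ranging from 0 to $N - 1$.


Solution: The proof for the case of $k = k'$ is trivial. When $k \ne k'$,
then

$$
e^{-2\pi i (k - k')/N} \ne 1 \quad \text{and} \quad e^{-2\pi i (k - k')} = 1
$$

for any choice of $k$ and $k'$, so that summing the geometric series gives

$$
\sum_{n=0}^{N-1} e^{-2\pi i n (k - k')/N}
 = \frac{1 - e^{-2\pi i (k - k')}}{1 - e^{-2\pi i (k - k')/N}} = 0. \quad ♣
$$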


12.4 Applications in Physics and Engineering 

12.4.1 Fraunhofer Diffraction I 

In optics, Fourier transformation is a powerful tool to describe an important
class of wave diffractions, called Fraunhofer diffraction; this refers to the
diffraction of electromagnetic radiation observed at a point far from a slit or
an aperture. A Fraunhofer diffraction pattern can be described by using the
wave theory of light, which predicts the areas of constructive and destructive
interference.

Let us derive the diffraction pattern produced by a rectangular aperture
with width $2a$ and height $2b$. We assume that both incident and diffracted waves
can be approximated as plane waves with wavelength $\lambda$. In order to make
this assumption, the diffracting obstacle and the observation point must be
sufficiently far from the light source so that the curvature of the incident and
diffracted light can be neglected (see Fig. 12.1). According to elementary wave
optics, the amplitude of light at $\mathbf{R}$ on the screen is given by



Fig. 12.1. Configurations of the light source, a rectangular aperture, and the screen


u(R) = — [ u(r')e lk ^ r ^dr' . 

2 tt-R J AS' 

Here, k = 27r/A, AS' represents the area of the rectangular aperture through 
which light passes and u(r') is the amplitude of the incident wave at r' within 
the aperture: 

u(r') = Ae ikr '. 

We assume that this incident wave is oriented in the direction of the 2 -axis. 
Then, the wave vector k is perpendicular to the position vector r' so that 

u(r') = A = const. 

Hence, we have 

u(R) = ~— ( dx' [ dy'e ik \ Rr '\. (12.48) 

2ttR J_ a J_ b 


Set $\mathbf{R} = (x, y, z)$ and $\mathbf{r}' = (x', y', 0)$, where the origin is located at the center
of the aperture. Under the assumption that $z \gg |x|, |y|$ and $|x|, |y| \gg |x'|, |y'|$,
we have

$$|\mathbf{R}-\mathbf{r}'| = \sqrt{z^2 + (x-x')^2 + (y-y')^2} = \sqrt{R^2 - 2(xx'+yy') + x'^2 + y'^2}$$

$$\simeq R\left[1 - \frac{xx'+yy'}{R^2} + \frac{x'^2+y'^2}{2R^2}\right] \simeq R\left[1 - \frac{xx'+yy'}{R^2}\right].$$

Substituting this into (12.48) yields

$$u(\mathbf{R}) = -\frac{ikA}{2\pi R}\, e^{ikR} \int_{-a}^{a} e^{-ikxx'/R}\,dx' \int_{-b}^{b} e^{-ikyy'/R}\,dy' = -\frac{2ikA}{\pi R}\, e^{ikR}\, \frac{\sin(kax/R)}{kx/R}\, \frac{\sin(kby/R)}{ky/R}.$$

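The light intensity distribution $I(\mathbf{R})$ on the screen is thus given by

$$I(\mathbf{R}) = |u(\mathbf{R})|^2 \propto \left[\frac{\sin(kaX)}{kaX}\right]^2 \left[\frac{\sin(kbY)}{kbY}\right]^2,$$

where

$$X = \frac{x}{R}, \qquad Y = \frac{y}{R}.$$

Remember that $(\sin\xi)/\xi = 0$ at $\xi = \pm n\pi$ with integers $n = 1, 2, \cdots$. In
addition, since $k = 2\pi/\lambda$, we conclude that

$$I(\mathbf{R}) = 0 \quad \text{at} \quad X = \pm\frac{m\lambda}{2a} \quad \text{or} \quad Y = \pm\frac{n\lambda}{2b}, \qquad (m, n = 1, 2, \cdots),$$

which describes the diffraction pattern generated on the screen.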
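The location of the dark fringes can be checked numerically. The following is a minimal sketch, assuming the normalized intensity $[\sin(kaX)/(kaX)]^2[\sin(kbY)/(kbY)]^2$ derived above; the function names and the sample values (500 nm light, half-widths of 50 and 100 micrometres) are illustrative assumptions and do not come from the text.

```python
import math

def sinc(xi):
    """(sin xi)/xi, with the removable singularity at xi = 0 filled in."""
    return 1.0 if xi == 0.0 else math.sin(xi) / xi

def relative_intensity(X, Y, a, b, wavelength):
    """I/I(0) for an aperture occupying |x'| <= a, |y'| <= b, with X = x/R, Y = y/R."""
    k = 2.0 * math.pi / wavelength
    return (sinc(k * a * X) * sinc(k * b * Y)) ** 2

def dark_fringe(m, half_width, wavelength):
    """Position m*lambda/(2*half_width) of the m-th zero along one axis."""
    return m * wavelength / (2.0 * half_width)

# Hypothetical sample values (not from the text)
wavelength, a, b = 500e-9, 50e-6, 100e-6
for m in (1, 2, 3):
    X = dark_fringe(m, a, wavelength)
    print(m, X, relative_intensity(X, 0.0, a, b, wavelength))
```

The printed intensities vanish to within rounding error, while halfway to the first zero the intensity is still $(2/\pi)^2 \approx 0.41$ of its central value.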


12.4.2 Fraunhofer Diffraction II 


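We next consider the case of a circular aperture with radius a. For convenience,
we use the polar coordinates defined by $x = r\cos\theta$, $y = r\sin\theta$ (and likewise
$x' = r'\cos\theta'$, $y' = r'\sin\theta'$). Then (12.48), in the same far-field approximation
as before, reads

$$u(\mathbf{R}) \propto e^{ikR} \int_{\Delta S'} \exp\left[\frac{-ik(xx'+yy')}{R}\right] d\mathbf{r}'$$

$$= e^{ikR} \int_0^a dr' \int_0^{2\pi} d\theta'\, r' \exp\left[\frac{-ikrr'(\cos\theta\cos\theta' + \sin\theta\sin\theta')}{R}\right]$$

$$= e^{ikR} \int_0^a dr' \int_0^{2\pi} d\theta'\, r' \exp\left[\frac{-ikrr'\cos(\theta'-\theta)}{R}\right].$$
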


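To make this concise, we use the following formulae based on the Bessel
function $J_n(x)$:

$$\int_0^{2\pi} e^{i\zeta\cos\phi}\,d\phi = 2\pi J_0(\zeta), \qquad \int_0^{v} \zeta J_0(\zeta)\,d\zeta = v J_1(v).$$

These give us

$$u(\mathbf{R}) \propto 2\pi a^2\, \frac{J_1\!\left(\dfrac{kar}{R}\right)}{\dfrac{kar}{R}},$$

where the explicit form of $J_1(x)$ is obtained from the definition of $J_\nu(x)$,

$$J_\nu(x) = \left(\frac{x}{2}\right)^{\nu} \sum_{\ell=0}^{\infty} \frac{(-1)^{\ell}}{\ell!\,\Gamma(\nu+\ell+1)} \left(\frac{x}{2}\right)^{2\ell},$$

and thus $\lim_{x\to 0} J_1(x)/x = 1/2$. The first zero of $J_1(x)$ is located at $x \simeq 1.22\pi$.
Therefore, the radius $r_0$ of the innermost dark ring on the screen is given by

$$\frac{kar_0}{R} \simeq 1.22\pi, \quad \text{i.e.,} \quad r_0 \simeq \frac{0.61\lambda R}{a}.$$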
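The two numerical facts used above, the limit $J_1(x)/x \to 1/2$ and the first zero near $1.22\pi$, can be reproduced from the series definition of $J_\nu(x)$. This is a minimal sketch, assuming the series is truncated after 40 terms (ample for $x < 5$) and that the first zero is bracketed by the interval [3, 4.5]; both choices, and the function names, are assumptions made for illustration.

```python
import math

def bessel_j(nu, x, terms=40):
    """J_nu(x) summed from its power series (nu >= 0, moderate |x|)."""
    total = 0.0
    for l in range(terms):
        total += ((-1) ** l * (x / 2.0) ** (2 * l)
                  / (math.factorial(l) * math.gamma(nu + l + 1)))
    return (x / 2.0) ** nu * total

def first_zero_j1(lo=3.0, hi=4.5, tol=1e-12):
    """Bisection for the first positive zero of J_1, assumed to lie in [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j(1, lo) * bessel_j(1, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

zero = first_zero_j1()
print("first zero / pi       :", zero / math.pi)         # close to 1.22
print("r0 in units of lam*R/a:", zero / (2.0 * math.pi))  # close to 0.61
print("J1(x)/x at small x    :", bessel_j(1, 1e-6) / 1e-6)
```

Since $k = 2\pi/\lambda$, the coefficient of $\lambda R/a$ in $r_0$ is the zero divided by $2\pi$, which is where the value 0.61 comes from.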


12.4.3 Amplitude Modulation Technique 


We conclude this chapter with a discussion regarding the use of Fourier trans- 
formations in an amplitude modulation (AM) technique. This technique 
is used in electronic communication, most commonly for transmitting infor- 
mation via a radio carrier wave. As the name indicates, AM works by mod- 
ulating the vibrational amplitude of the transmitted signal according to the 
information being sent. This is in contrast to the frequency modulation 
(FM) technique that is also commonly used for transmitting sound, but by 
modulating its frequency. 

For AM, we use two kinds of waves: a carrier wave c(t) and a message
wave m(t) that contains information on the message to be transmitted. For
simplicity, the carrier wave is modeled here as a simple sinusoidal wave written as

$$c(t) = C\cos(\omega_c t + \phi_c),$$

where the radio frequency (in hertz) is given by $\omega_c/(2\pi)$. $C$ and $\phi_c$ are
constants representing the carrier amplitude and the initial phase, respectively,
and their values are set to 1 and 0. AM is then realized by forming the
product

$$y(t) = m(t)\,c(t),$$

whose Fourier transform $Y(\omega)$ is expressed as

$$Y(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} m(t)\, \frac{e^{i\omega_c t} + e^{-i\omega_c t}}{2}\, e^{-i\omega t}\,dt = \frac{1}{2}\left[M(\omega+\omega_c) + M(\omega-\omega_c)\right]. \tag{12.49}$$

Here $M(\omega)$ is the Fourier transform of m(t).
Fig. 12.2. Top: A carrier wave $c(t) = \sin(\omega_c t)$ with $\omega_c = 5.0$ and a message wave
$m(t) = 2\exp[-(t-t_0)^2/4]$ with $t_0 = 1.5$. Middle: The products $c(t)m(t) = y(t)$ and
$c^2(t)m(t)$. Bottom: The power spectra $|\mathcal{F}[y(t)]|^2$ and $|\mathcal{F}[c(t)y(t)]|^2$, plotted against frequency
The result in (12.49) implies that the modulated signal y(t) has two groups
of components: one at positive frequencies (centered at $+\omega_c$) and one at
negative frequencies (centered at $-\omega_c$). Figure 12.2 illustrates a carrier wave
$c(t) = \sin(\omega_c t)$ with $\omega_c = 5.0$, a message wave $m(t) = 2\exp[-(t-t_0)^2/4]$ with
$t_0 = 1.5$, and the power spectrum of $y(t) = c(t)m(t)$ [i.e., the $\omega$-dependence of
$Y(\omega)$] described by (12.49).
The frequency shift from $\omega$ to $\omega \pm \omega_c$, which is clearly evident, facilitates the
tuning of the frequency of the transmitted signal to the desired value. We are
concerned only with positive frequencies. The negative ones are mathematical
artifacts that carry no additional information.

In order to reproduce the original signal m(t) from the modulated one
y(t), it is sufficient to multiply y(t) by c(t) and follow that with a filtering
process. The Fourier transform of the product c(t)y(t) is given as

$$\mathcal{F}[c(t)y(t)] = \mathcal{F}[m(t)\cos^2(\omega_c t)] = \frac{1}{2}M(\omega) + \frac{1}{4}\left[M(\omega+2\omega_c) + M(\omega-2\omega_c)\right],$$

which follows from $\cos^2(\omega_c t) = [1 + \cos(2\omega_c t)]/2$. We pick out the first term in the
last expression (filtering removes the components centered at $\pm 2\omega_c$) and take its
inverse transform, thus obtaining, up to the constant factor 1/2, $\mathcal{F}^{-1}[M(\omega)] = m(t)$.
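Both spectral identities of this subsection can be verified numerically. The sketch below assumes the Fourier convention with prefactor $1/\sqrt{2\pi}$ and kernel $e^{-i\omega t}$, a cosine carrier with $C = 1$ and $\phi_c = 0$ as in the derivation, and the Gaussian message wave quoted for Fig. 12.2; the integration window, step count, and function names are illustrative assumptions.

```python
import cmath
import math

OMEGA_C = 5.0   # carrier angular frequency (value quoted for Fig. 12.2)
T0 = 1.5        # center of the Gaussian message wave (same source)

def message(t):
    return 2.0 * math.exp(-((t - T0) ** 2) / 4.0)

def carrier(t):
    return math.cos(OMEGA_C * t)

def modulated(t):
    """y(t) = m(t) c(t)."""
    return message(t) * carrier(t)

def demodulated(t):
    """c(t) y(t) = m(t) cos^2(omega_c t)."""
    return carrier(t) * modulated(t)

def fourier(f, omega, t_min=-30.0, t_max=30.0, n=6000):
    """(1/sqrt(2 pi)) * integral of f(t) e^{-i omega t} dt, trapezoidal rule."""
    h = (t_max - t_min) / n
    total = 0.5 * (f(t_min) * cmath.exp(-1j * omega * t_min)
                   + f(t_max) * cmath.exp(-1j * omega * t_max))
    for j in range(1, n):
        t = t_min + j * h
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * h / math.sqrt(2.0 * math.pi)

def M(omega):
    """Fourier transform of the message wave."""
    return fourier(message, omega)

w = 5.0
print(abs(fourier(modulated, w)), abs(0.5 * (M(w + OMEGA_C) + M(w - OMEGA_C))))
```

For this message wave the spectrum $M(\omega)$ is negligible beyond a few units of $\omega$, so the components at $\pm 2\omega_c = \pm 10$ are well separated from the baseband term and a low-pass filter can isolate it.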
Abstract Using the Laplace transform for the mathematical description of a physical
system considerably simplifies the analysis of its behavior. Many useful applications
and formulas related to Laplace transforms can be found in other textbooks,
but here we focus on the theoretical background, particularly, on the convergence 
properties of the various forms of Laplace transforms. It is important to note that a 
Laplace transform exists only if the corresponding improper integral, known as the 
Laplace integral, converges. Hence, the convergence of the improper integral must 
be confirmed prior to discussing the Laplace transform of a given function. Thus we 
devote a portion of this chapter to an analysis of the conditions necessary for the 
convergence of Laplace integrals, in contrast to the standard literature that deals 
primarily with the practical applications of Laplace transforms. 


13.1 Basic Operations 

13.1.1 Definitions 

The Laplace transformation associates a function f(x) of a real variable x
with a suitable function F(s) of a complex variable s. This correspondence is
essentially one-to-one in both directions and often allows us to replace a given
complicated function by a simpler one. The advantage of this operation manifests
particularly in applications to problems of linear differential equations (see 
Chap. 15). We shall see that the Laplace transformation allows us to reduce 
a linear differential equation of f(x) to a certain simple algebraic equation of 
F(s), which yields solutions of the original differential equations more readily 
than other techniques. Furthermore, it turns out that this reduction method 
can be extended to systems of differential equations (ordinary and partial) as 
well as to integral equations, which enhances the importance of studying and 
understanding the Laplace transform. 

To begin with, we define the Laplace transformation operator L that maps 
a function f(x) to a corresponding function F(s): 
♠ Laplace transformation:

The (one-sided) Laplace transformation, denoted by the operator L,
is defined by

$$L[f(x)] = \int_0^{\infty} e^{-sx} f(x)\,dx = F(s), \tag{13.1}$$

which associates an image function F(s) of the complex variable $s = \sigma + i\omega$
with a single-valued function f(x) (x real) such that the integral (13.1)
exists.

Laplace integral:

The integral given in (13.1) is called the Laplace integral. If the
Laplace integral exists for a given f(x), the image function F(s) is called
the (one-sided) Laplace transform of f(x).

It is important to keep in mind the difference between the Laplace integral 
and the Laplace transform. Namely, the Laplace transform exists only when 
the Laplace integral exists (i.e., converges). Convergence properties of Laplace 
integrals are determined by the value of s and the feature of the function f(x), 
which is discussed fully in Sect. 13.3. In the meantime, we assume that f(x) 
is a function that allows the Laplace integral to converge for certain s. 

13.1.2 Several Remarks 

Below are several important remarks regarding the properties of the Laplace 
transform (13.1). 


1. The definition (13.1) states that for a given F(s), there is at most one
continuous function f(x). Nevertheless, it does not determine a unique f(x),
because if f(x) in (13.1) were altered at a finite number of isolated points,
F(s) would remain unchanged, as such discontinuous points make no
contribution to the integral. For this reason, we assume in the remainder of
this chapter that f(x) is continuous except at isolated points.

2. In order for the integral (13.1) to exist, any discontinuity of the integrand
inside the interval $(0, \infty)$ must be a finite jump so that there are right-hand
and left-hand limits at those discontinuous points. An exception is a
discontinuity at x = 0 (if it exists); for instance, the function $f(x) = 1/\sqrt{x}$
diverges at x = 0 but the integral (13.1) exists.

3. The inverse Laplace transform of F(s) is a function f(x) such that
L[f(x)] = F(s). Hence, the operation of taking an inverse Laplace transform
is denoted by $L^{-1}$ and we have

$$L^{-1}[F(s)] = f(x).$$

This expression implies the possibility of dealing with the operators $L$
and $L^{-1}$ algebraically, just as the equation $ax = y$ can be rewritten as
$x = a^{-1}y$. At this point, it is not clear how the inverse operation
$L^{-1}$ is to be performed, but actual manipulations are discussed in detail
in Sect. 13.4.2.

4. Not every function F(s) has an inverse Laplace transform. A sufficient
condition for F(s) to have its inverse transform is presented in Sect. 13.4.2.

13.1.3 Significance of Analytic Continuation 

Observe that the Laplace integral (13.1) involves a complex-valued term $e^{-sx}$
in its integrand, which makes it difficult to employ the standard methods
of integration that are applicable to real integrands. One way to proceed
would be to use the equation $e^{-sx} = e^{-\sigma x}\cos\omega x - ie^{-\sigma x}\sin\omega x$, which yields
two real integrands. This is, however, more complicated than necessary. An
easier method is to make use of the following theorem, which is verified in
Sect. 13.3.7:

♠ Analytic property of Laplace transform:

The Laplace transform F(s), which is a complex-valued function of a
complex variable s, is an analytic function in a region of $\mathrm{Re}(s) > \sigma_c$
with a specific real number $\sigma_c$.

Remark. Just at $\mathrm{Re}(s) = \sigma_c$, however, no general conclusion can be drawn.

This theorem states that once the value of $F(\sigma)$ on the real axis is known,
F(s) at an arbitrary point of the complex plane can be obtained by simply
replacing $\sigma$ by s. This replacement is based on an analytic continuation
from the semi-infinite line of the real axis, $\sigma > \sigma_c$, to the right half of the
s-plane, $\mathrm{Re}(s) > \sigma_c$, which is why we can perform the integration (13.1) as if
s were a real variable. Several examples given later clearly show the efficacy
of identifying s as a real parameter.

At first glance, the formality of replacing $\sigma$ by s amounts simply to a
change in symbol. But, without analytic continuation, we could no longer
regard our replacement of $\sigma$ by s as a mere formality; i.e., the concept of
analytic continuation lurks in the background.

Remark. In particular, those cases in which F(s) becomes multivalued cannot
be treated without paying heed in detail to the difference between $\sigma$ and s.
The latter issue regarding multivalued F(s) is discussed in Sect. 13.2.5.
13.1.4 Convergence of Laplace Integrals

Emphasis is placed on the fact that the Laplace integral (13.1) may or may
not exist depending on the value of s as well as the nature of f(x). A sufficient
condition for the Laplace integral to converge is that the real component of
s, Re(s), is greater than a specific value. This follows intuitively from the
definition (13.1): if the integral (13.1) exists for

$$s_0 = \sigma_0 + i\omega_0,$$

then the integral also exists for every s such that $\mathrm{Re}(s) > \sigma_0$, since in the
latter case

$$|e^{-sx}| < |e^{-s_0 x}| = e^{-\sigma_0 x}.$$

This is stated rigorously in the theorem below.

♠ Convergence of Laplace integrals:

If the Laplace integral

$$\int_0^{\infty} f(x) e^{-sx}\,dx \tag{13.2}$$

converges for $\mathrm{Re}(s) = \sigma_0$, then it also converges for $\mathrm{Re}(s) > \sigma_0$.

The proof is given in Sect. 13.3.4. This theorem implies the existence of a
specific real number $\sigma_c$ such that the integral (13.2) converges for $\mathrm{Re}(s) > \sigma_c$
and diverges for $\mathrm{Re}(s) < \sigma_c$ (see Fig. 13.1). The number $\sigma_c$ is called the
abscissa of convergence of the Laplace integral, whose value depends on
the nature of the function f(x). With this notation, we say that the region
of convergence of the Laplace integral is a half-plane to the right of
$\mathrm{Re}(s) = \sigma_c$. This region of convergence is of course identified with the defining
region of the Laplace transform F(s).

Remark. By definition, $\sigma_c$ may take $-\infty$ (or $\infty$), which means that the integral
(13.2) converges (or diverges) for all $\sigma$.


Examples Set f(x) = 1 for every x > 0. Then

$$L[f(x)] = \int_0^{\infty} e^{-sx}\,dx = \lim_{X\to\infty} \int_0^{X} e^{-sx}\,dx = \lim_{X\to\infty} \left[\frac{e^{-sx}}{-s}\right]_0^{X} = \frac{1}{s} - \frac{1}{s}\lim_{X\to\infty} e^{-sX}.$$

Hence, we have

$$L[f(x)] = \frac{1}{s} \quad \text{for } s > 0.$$

For s < 0, the integral does not converge. This indicates that in this case
$\sigma_c = 0$.
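The role of the abscissa of convergence in this example can be made concrete by evaluating the truncated integral $\int_0^X e^{-sx}\,dx$ for growing $X$. This is a small sketch using the closed form of the truncated integral for real s; the function name and the sample values of s and X are assumptions chosen for illustration.

```python
import math

def partial_integral(s, X):
    """Closed form of the truncated Laplace integral of f(x) = 1 for real s."""
    if s == 0.0:
        return X
    return (1.0 - math.exp(-s * X)) / s

for s in (0.5, -0.5):
    print(s, [partial_integral(s, X) for X in (10.0, 50.0, 100.0)])
```

For s = 0.5 the values settle at 1/s = 2, whereas for s = -0.5 they grow without bound, in line with $\sigma_c = 0$.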


Fig. 13.1. The abscissa of convergence $\sigma_c$, to the right of which the Laplace integral
converges (axes: Re s, Im s)


13.1.5 Abscissa of Absolute Convergence 

When the Laplace integral converges in the ordinary sense, it might converge 
absolutely in part or in all of its converging region. (Remember that the con- 
ditions for absolute convergence are more stringent than those for ordinary 
convergence). This leads us to define an abscissa of absolute convergence as 
follows: 


♠ Abscissa of absolute convergence:

Suppose that the Laplace integral (13.2) converges absolutely for
$\mathrm{Re}(s) = \sigma_0$ as

$$\int_0^{\infty} \left|f(x) e^{-sx}\right| dx = \int_0^{\infty} |f(x)|\, e^{-\sigma_0 x}\,dx < \infty. \tag{13.3}$$

The greatest lower bound $\sigma_a$ of such $\sigma_0$ that satisfy (13.3) is called the
abscissa of absolute convergence of the Laplace integral (13.2).

Thus once $\sigma_a$ is determined, we say that the integral (13.2) converges
absolutely for $\sigma > \sigma_a$, does not converge absolutely for $\sigma < \sigma_a$, and may or
may not converge absolutely at $\sigma = \sigma_a$. Since absolute convergence implies
ordinary convergence, it is clear that

$$\sigma_c \le \sigma_a.$$

The following example shows that $\sigma_a$ does not generally coincide with $\sigma_c$ (see
Fig. 13.2).

Fig. 13.2. The abscissa of convergence $\sigma_c$ and the abscissa of absolute convergence
$\sigma_a$ (axes: Re s, Im s)

Example $f(x) = e^{x}\sin e^{x}$
Set $u = e^{x}$; then we have

$$F(s) = \int_0^{\infty} e^{-sx} e^{x} \sin e^{x}\,dx = \int_1^{\infty} \frac{\sin u}{u^{s}}\,du.$$

The integral converges absolutely for $\mathrm{Re}(s) = \sigma > 1$, converges conditionally
for $0 < \sigma \le 1$, and diverges for $\sigma \le 0$. Hence, we have

$$\sigma_c = 0 \quad \text{and} \quad \sigma_a = 1,$$

which clearly indicates that in this case $\sigma_c \ne \sigma_a$.


13.1.6 Laplace Transforms of Elementary Functions 

Let us evaluate the Laplace transforms F(s) of several classes of elemen- 
tary functions. We treat the complex variable s as if it were real, bearing in 
mind that this formalism is based on the analyticity of F(s), as discussed in 
Sect. 13.1.3. The defining region of each F(s) is found on the right-hand side 
of the equation in question. 

1. $f(x) = x^{n}$, where n is a positive integer.

Integrating by parts, we have

$$F(s) = L[x^{n}] = \int_0^{\infty} x^{n} e^{-sx}\,dx = \left[-\frac{x^{n} e^{-sx}}{s}\right]_0^{\infty} + \frac{n}{s}\int_0^{\infty} x^{n-1} e^{-sx}\,dx. \tag{13.4}$$

Since s > 0 and n > 0, the first term in the last expression of (13.4)
vanishes. Iteration of this process yields

$$L[x^{n}] = \frac{n}{s} L[x^{n-1}] = \cdots = \frac{n!}{s^{n}} L[x^{0}] = \frac{n!}{s^{n+1}},$$

since $L[x^{0}] = L[1] = 1/s$. As a result, we have

$$F(s) = L[x^{n}] = \frac{n!}{s^{n+1}} \quad (\sigma > 0).$$

2. $f(x) = e^{ax}$, where a is a real constant.

$$F(s) = L[e^{ax}] = \int_0^{\infty} e^{-sx} e^{ax}\,dx = \frac{1}{s-a} \quad (\sigma > a).$$

3. $f(x) = \sin ax$, where a is a real constant.

Integrating by parts twice, we obtain

$$F(s) = L[\sin ax] = \int_0^{\infty} e^{-sx} \sin ax\,dx = \left[-\frac{e^{-sx}}{a}\cos ax\right]_0^{\infty} + \frac{1}{a}\int_0^{\infty} (-s) e^{-sx} \cos ax\,dx$$

$$= \frac{1}{a} - \frac{s}{a}\left\{\left[\frac{e^{-sx}}{a}\sin ax\right]_0^{\infty} + \frac{s}{a}\int_0^{\infty} e^{-sx} \sin ax\,dx\right\} = \frac{1}{a} - \frac{s^{2}}{a^{2}} F(s),$$

where we have used the fact that as s is positive, $e^{-sx} \to 0$ as $x \to \infty$,
whereas $\sin ax$ and $\cos ax$ are bounded as $x \to \infty$. Solving for F(s), we obtain

$$F(s) = L[\sin ax] = \frac{a}{s^{2}+a^{2}} \quad (\sigma > 0).$$

In a similar manner, we obtain

$$L[\cos ax] = \int_0^{\infty} e^{-sx} \cos ax\,dx = \frac{s}{s^{2}+a^{2}} \quad (\sigma > 0).$$

4. $f(x) = \cosh ax$, where a is a real constant.

Using the linearity property of the Laplace transform operator L, we obtain

$$L[\cosh ax] = L\left[\frac{e^{ax} + e^{-ax}}{2}\right] = \frac{1}{2}L[e^{ax}] + \frac{1}{2}L[e^{-ax}] = \frac{1}{2}\left(\frac{1}{s-a} + \frac{1}{s+a}\right) = \frac{s}{s^{2}-a^{2}} \quad (\sigma > |a|).$$
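The closed forms obtained above can be compared with a direct numerical evaluation of the Laplace integral at a real value of s inside each region of convergence. The sketch below assumes Simpson's rule on a truncated interval [0, 60]; the helper name `laplace_numeric`, the truncation point, and the sample values s = 2, a = 1.5, n = 3 are illustrative assumptions.

```python
import math

def laplace_numeric(f, s, upper=60.0, n=30000):
    """Simpson approximation of int_0^upper e^{-s x} f(x) dx for real s (n even)."""
    h = upper / n
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for j in range(1, n):
        x = j * h
        total += (4.0 if j % 2 else 2.0) * math.exp(-s * x) * f(x)
    return total * h / 3.0

s, a, n_pow = 2.0, 1.5, 3   # sample values with s > |a|
checks = {
    "x^n":      (lambda x: x ** n_pow,        math.factorial(n_pow) / s ** (n_pow + 1)),
    "exp(ax)":  (lambda x: math.exp(a * x),   1.0 / (s - a)),
    "sin(ax)":  (lambda x: math.sin(a * x),   a / (s ** 2 + a ** 2)),
    "cos(ax)":  (lambda x: math.cos(a * x),   s / (s ** 2 + a ** 2)),
    "cosh(ax)": (lambda x: math.cosh(a * x),  s / (s ** 2 - a ** 2)),
}
for name, (f, closed_form) in checks.items():
    print(name, laplace_numeric(f, s), closed_form)
```

The truncation is harmless here because every integrand decays at least like $e^{-(s-a)x}$; for s closer to the abscissa of convergence a larger upper limit would be needed.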


Exercises 

1. Show the linearity of the Laplace transformation operator L.

Solution: It follows from the definition of the operator L that

$$L[c_1 f(x) + c_2 g(x)] = \int_0^{\infty} e^{-sx}\{c_1 f(x) + c_2 g(x)\}\,dx = c_1\int_0^{\infty} e^{-sx} f(x)\,dx + c_2\int_0^{\infty} e^{-sx} g(x)\,dx$$

$$= c_1 L[f(x)] + c_2 L[g(x)],$$

where $c_1$ and $c_2$ are arbitrary constants. This clearly shows the linearity
of the operator L. ♣

2. Find the Laplace transform of the function

$$f(x) = \begin{cases} 0, & 0 < x < c, \\ 1, & x > c. \end{cases}$$

Solution:

$$L[f(x)] = \int_0^{\infty} e^{-sx} f(x)\,dx = \int_c^{\infty} e^{-sx}\,dx = \frac{e^{-cs}}{s} \quad (\sigma > 0). \; ♣$$

3. Show that if f(x) is real and F(s) = L[f(x)] is single-valued, then F(s)
is real.

Solution: Set $s = \sigma > \sigma_c$ in the equation $F(s) = \int_0^{\infty} f(x) e^{-sx}\,dx$.
Then the integrand $f(x)e^{-\sigma x}$ is real, so $F(\sigma)$ is real. This establishes
that F(s) is real on the real axis to the right of the point $s = \sigma_c$. In
view of analytic continuation, therefore, F(s) is a real-valued analytic
function. ♣
13.2 Properties of Laplace Transforms

13.2.1 First Shifting Theorem

In physical applications, we are sometimes required to calculate the Laplace
transform of functions multiplied by exponential factors such as

$$e^{-ax} f(x),$$

where a is real or complex. This kind of problem can be simplified by applying
the theorem below.

♠ The first shifting theorem:

If F(s) = L[f(x)] for $\sigma > \sigma_c$, then

$$F(s+a) = L[e^{-ax} f(x)]$$

for $\sigma > \sigma_c - \mathrm{Re}(a)$, where a is real or complex.

Proof Suppose $\sigma_c$ to be the abscissa of convergence for F(s). Then the integral

$$\int_0^{\infty} e^{-ax} f(x) e^{-sx}\,dx = \int_0^{\infty} f(x) e^{-(s+a)x}\,dx \tag{13.5}$$

clearly converges for $\mathrm{Re}(s+a) > \sigma_c$. Observe that the integral on the
right-hand side of (13.5) is an expression for F(s + a). Thus we have the general
result:

$$L[e^{-ax} f(x)] = F(s+a),$$

where F(s) = L[f(x)]. ♣


The above theorem states that if we know the Laplace transform of any func- 
tion, the transform of that function multiplied by an exponential can imme- 
diately be obtained by a simple shift (or translation) in the s variable. 


Examples 1. The first shifting theorem tells us that, for real a,

$$L[e^{-ax} x^{n}] = \frac{n!}{(s+a)^{n+1}}, \quad \sigma > -a,$$

since

$$L[x^{n}] = \frac{n!}{s^{n+1}}, \quad \sigma > 0.$$

2. Similarly, it follows from the first shifting theorem that

$$L[e^{-ax} \sin bx] = \frac{b}{(s+a)^{2} + b^{2}}, \quad \sigma > -a,$$

where we use the fact that

$$L[\sin bx] = \frac{b}{s^{2} + b^{2}}.$$

13.2.2 Second Shifting Theorem 


For the next case, assume again that a function f(x) has a transform F(s)
and consider a shift in the x variable from x to $x - x_0$, where $x_0$ is a positive
constant. Stipulating that the new function be zero for $x < x_0$, it can be
written

$$f(x - x_0)\,\theta(x - x_0), \tag{13.6}$$

where

$$\theta(x) = \begin{cases} 0, & x < 0, \\ 1, & x > 0. \end{cases}$$

The Laplace transform of the shifted function (13.6) is thus represented by
the integral

$$\int_0^{\infty} f(x - x_0)\,\theta(x - x_0)\, e^{-sx}\,dx = \int_{x_0}^{\infty} f(x - x_0)\, e^{-sx}\,dx.$$

Now we change the variable of integration to $x' = x - x_0$, which gives us

$$L[f(x - x_0)\,\theta(x - x_0)] = e^{-sx_0} \int_0^{\infty} f(x') e^{-sx'}\,dx' = e^{-sx_0} F(s).$$

The result is stated formally below.

♠ The second shifting theorem:

If F(s) = L[f(x)] for $\sigma > \sigma_c$, then

$$e^{-sx_0} F(s) = L[f(x - x_0)\,\theta(x - x_0)]$$

for $\sigma > \sigma_c$, where $\theta(x)$ is a unit step function and $x_0$ is a real and positive
constant.


Examples Consider the Laplace transform of the function

$$f(x) = \begin{cases} 0 & (x < 0), \\ 1/a & (0 < x < a), \\ 0 & (x > a). \end{cases}$$

Using the step function, we express it as

$$f(x) = \frac{\theta(x) - \theta(x - a)}{a}.$$

Hence, it follows from the second shifting theorem that

$$L[f(x)] = \frac{L[\theta(x)] - L[\theta(x - a)]}{a} = \frac{1 - e^{-as}}{as}.$$

Note that in view of l'Hôpital's rule, $L[f(x)] \to 1$ as $a \to 0$. In this limit f(x)
becomes an infinitely narrow pulse of unit area, so the latter result means that
the Laplace transform of such a pulse (the delta function) equals 1.
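The limit of vanishing pulse width is easy to examine numerically. The following is a minimal sketch for real s, assuming the closed form $(1 - e^{-as})/(as)$ and a midpoint-rule evaluation of the defining integral over (0, a) as an independent check; the function names and the sample values are assumptions introduced for illustration.

```python
import math

def pulse_transform(s, a):
    """(1 - e^{-a s})/(a s), written with expm1 to stay accurate for small a*s."""
    return -math.expm1(-a * s) / (a * s)

def pulse_transform_numeric(s, a, n=2000):
    """Midpoint-rule value of int_0^a e^{-s x} (1/a) dx."""
    h = a / n
    return sum(math.exp(-s * (j + 0.5) * h) for j in range(n)) * h / a

s = 2.0
for a in (1.0, 0.1, 0.001, 1e-9):
    print(a, pulse_transform(s, a))
print(pulse_transform(s, 0.5), pulse_transform_numeric(s, 0.5))
```

The printed values approach 1 as the width a shrinks, while the pulse keeps unit area.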


13.2.3 Laplace Transform of Periodic Functions 

We now consider the Laplace transform of a periodic function f(x) of period
$\lambda$, i.e., $f(x + \lambda) = f(x)$. Assuming that f(x) is piecewise continuous, we
have by definition

$$L[f(x)] = \int_0^{\infty} e^{-sx} f(x)\,dx = \int_0^{\lambda} e^{-sx} f(x)\,dx + \int_{\lambda}^{2\lambda} e^{-sx} f(x)\,dx + \int_{2\lambda}^{3\lambda} e^{-sx} f(x)\,dx + \cdots.$$

On the right-hand side, let $x = u + \lambda$ in the second integral, $x = u + 2\lambda$ in
the third integral, and so on. We then get

$$L[f(x)] = \int_0^{\lambda} e^{-sx} f(x)\,dx + \int_0^{\lambda} e^{-s(u+\lambda)} f(u + \lambda)\,du + \int_0^{\lambda} e^{-s(u+2\lambda)} f(u + 2\lambda)\,du + \cdots.$$

From the hypothesis, $f(u + \lambda) = f(u)$, $f(u + 2\lambda) = f(u)$, etc. Replacing the
dummy variable u by x yields

$$L[f(x)] = \left(1 + e^{-s\lambda} + e^{-2s\lambda} + \cdots\right) \int_0^{\lambda} e^{-sx} f(x)\,dx = \frac{1}{1 - e^{-s\lambda}} \int_0^{\lambda} e^{-sx} f(x)\,dx. \tag{13.7}$$

Once we introduce the function

$$f_0(x) = \begin{cases} f(x), & 0 < x < \lambda, \\ 0, & \text{otherwise}, \end{cases}$$

equation (13.7) becomes

$$F(s) = \frac{F_0(s)}{1 - e^{-s\lambda}},$$

where

$$F_0(s) = \int_0^{\infty} e^{-sx} f_0(x)\,dx = \int_0^{\lambda} e^{-sx} f(x)\,dx.$$

So we have proven the following result:

♠ Laplace transform of a periodic function:

If f(x) is a periodic function of period $\lambda$, its Laplace transform is given
by

$$F(s) = \frac{F_0(s)}{1 - e^{-s\lambda}}, \tag{13.8}$$

where

$$F_0(s) = \int_0^{\lambda} e^{-sx} f(x)\,dx.$$

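Examples Consider the Laplace transform of the periodic square wave
described by $f(x + 2\lambda) = f(x)$ with

$$f(x) = \begin{cases} 1 & (0 < x < \lambda), \\ -1 & (\lambda < x < 2\lambda). \end{cases}$$

From (13.8), with the period now equal to $2\lambda$, we obtain

$$F(s) = \frac{1}{1 - e^{-2s\lambda}} \int_0^{2\lambda} e^{-sx} f(x)\,dx = \frac{1}{1 - e^{-2s\lambda}} \left[\int_0^{\lambda} e^{-sx}\,dx - \int_{\lambda}^{2\lambda} e^{-sx}\,dx\right]$$

$$= \frac{(1 - e^{-s\lambda})^{2}}{s(1 - e^{-2s\lambda})} = \frac{1}{s}\tanh\left(\frac{s\lambda}{2}\right).$$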
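The square-wave result can be cross-checked in two independent ways: through formula (13.8) with the one-period integral evaluated numerically, and through a brute-force integral over many periods that makes no use of periodicity. This is a sketch for real s, assuming the midpoint rule with the jumps of the wave placed on cell boundaries; function names, step counts, and the sample values s = 1, $\lambda$ = 1 are illustrative assumptions.

```python
import math

def square_wave(x, lam):
    """+1 on (0, lam), -1 on (lam, 2 lam), repeated with period 2 lam."""
    return 1.0 if (x % (2.0 * lam)) < lam else -1.0

def transform_closed_form(s, lam):
    return math.tanh(s * lam / 2.0) / s

def transform_from_one_period(s, lam, n=4000):
    """F0(s)/(1 - e^{-s T}) with period T = 2 lam; F0 by the midpoint rule (n even)."""
    period = 2.0 * lam
    h = period / n
    f0 = sum(math.exp(-s * (j + 0.5) * h) * square_wave((j + 0.5) * h, lam)
             for j in range(n)) * h
    return f0 / (1.0 - math.exp(-s * period))

def transform_brute_force(s, lam, periods=20, n_per_period=2000):
    """Midpoint-rule integral over many whole periods, truncating the infinite range."""
    h = 2.0 * lam / n_per_period
    n = periods * n_per_period
    return sum(math.exp(-s * (j + 0.5) * h) * square_wave((j + 0.5) * h, lam)
               for j in range(n)) * h

s, lam = 1.0, 1.0
print(transform_closed_form(s, lam),
      transform_from_one_period(s, lam),
      transform_brute_force(s, lam))
```

All three numbers agree to within the discretization error, which illustrates how (13.8) replaces an integral over the whole half-line by one over a single period.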


13.2.4 Laplace Transform of Derivatives and Integrals 

The Laplace transform of derivatives is a most important issue in terms of
applications for solving differential equations. We shall see below that through
the transform, certain kinds of differential equations are reduced to algebraic
equations that are easy to manipulate.

♠ Laplace transform of derivatives:

If F(s) = L[f(x)] for $\sigma > \sigma_c$ and if

$$\lim_{x\to\infty} e^{-sx} f(x) = 0 \quad \text{for } \sigma > \sigma_c, \tag{13.9}$$

then we have

$$L[f'(x)] = sF(s) - f(0). \tag{13.10}$$

Proof Integration by parts yields

$$L[f'(x)] = \int_0^{\infty} e^{-sx} f'(x)\,dx = \left[e^{-sx} f(x)\right]_0^{\infty} - \int_0^{\infty} (-s) e^{-sx} f(x)\,dx. \tag{13.11}$$

The second term on the right-hand side of (13.11) converges to sF(s) for
$\sigma > \sigma_c$. In addition, the first term reads $-f(0)$ from the hypothesis of (13.9).
Thus for $\sigma > \sigma_c$, we obtain (13.10). ♣

This result can be extended to cases of higher derivatives.

♠ Laplace transform of higher derivatives:

Suppose f(x) to be such that $f^{(n-1)}(x)$ is continuous. If F(s) = L[f(x)]
for $\sigma > \sigma_c$ and if

$$\lim_{x\to\infty} e^{-sx} f^{(k)}(x) = 0$$

for $k = 0, 1, \cdots, n-1$ and $\sigma > \sigma_c$, then

$$L[f^{(n)}(x)] = s^{n} F(s) - \sum_{k=1}^{n} s^{n-k} f^{(k-1)}(0).$$

The above theorem is central to the use of the Laplace transform for solving
differential equations with specified initial conditions (i.e., initial value
problems).
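The higher-derivative formula can be exercised on a function whose transforms are already known. The sketch below assumes $f(x) = \cos ax$ with $F(s) = s/(s^2 + a^2)$, so that $f(0) = 1$ and $f'(0) = 0$, and compares the formula with the known transforms of $f' = -a\sin ax$ and $f'' = -a^2\cos ax$; the helper name `transform_of_derivative` and the sample values are hypothetical.

```python
def transform_of_derivative(F_s, s, initial_values):
    """s^n F(s) - sum_{k=1}^{n} s^(n-k) f^(k-1)(0), with initial_values = [f(0), f'(0), ...]."""
    n = len(initial_values)
    return s ** n * F_s - sum(s ** (n - k) * initial_values[k - 1]
                              for k in range(1, n + 1))

s, a = 2.0, 3.0
F_cos = s / (s ** 2 + a ** 2)                      # L[cos ax]
L_sin = a / (s ** 2 + a ** 2)                      # L[sin ax]

first = transform_of_derivative(F_cos, s, [1.0])          # L[f'] from the formula
second = transform_of_derivative(F_cos, s, [1.0, 0.0])    # L[f''] from the formula

print(first, -a * L_sin)        # both equal L[-a sin ax]
print(second, -a ** 2 * F_cos)  # both equal L[-a^2 cos ax]
```

The same bookkeeping is what turns a linear differential equation with given initial values into an algebraic equation for F(s).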
♠ Laplace transform of integrals:

If $g(x) = \int_0^{x} f(u)\,du$, L[f(x)] = F(s), and if

$$\lim_{x\to\infty} e^{-sx} g(x) = 0,$$

then

$$L[g(x)] = \frac{F(s)}{s}.$$

Proof From the hypothesis, we have g(0) = 0 and g'(x) = f(x), and thus

$$L[g'(x)] = L[f(x)].$$

The left-hand side becomes

$$L[g'(x)] = sL[g(x)] - g(0) = sL[g(x)].$$

As a result, we obtain

$$L[g(x)] = \frac{1}{s} L[f(x)] = \frac{F(s)}{s}. \; ♣$$


13.2.5 Laplace Transforms Leading to Multivalued Functions 


Some care should be taken when the Laplace transform results in a
multivalued function. A typical example is the transform of the function

$$f(x) = \frac{1}{\sqrt{x}} \quad (x > 0). \tag{13.12}$$

Although this function has a singularity at x = 0, the improper integral
having a real integrand,

$$F(\sigma) = \int_0^{\infty} \frac{e^{-\sigma x}}{\sqrt{x}}\,dx, \tag{13.13}$$

converges for $\sigma > 0$. In what follows, we first evaluate the integral (13.13) and
then continue the result analytically to arrive at a suitable region of
the complex s-plane where we can get a precise form of F(s).

The integral (13.13) can be readily evaluated by setting $\sigma x = u^{2}$; then it
reads

$$F(\sigma) = \frac{2}{\sqrt{\sigma}} \int_0^{\infty} e^{-u^{2}}\,du = \frac{\sqrt{\pi}}{\sqrt{\sigma}}. \tag{13.14}$$

Now we would like to continue the result (13.14) analytically to the
complex s-plane. At first glance, it suffices to replace $\sqrt{\sigma}$ by $\sqrt{s}$ symbolically.
However, this is not sufficient because the function $\sqrt{s}$ is double-valued; e.g.,
when $s = i = e^{i\pi/2}$, $\sqrt{s}$ may take the two distinct values $e^{i\pi/4}$ and $e^{-3i\pi/4}$
(see Fig. 13.3). Thus we have two possible choices (i.e., two sheets of the Riemann
surface) when performing analytic continuation from the real single-valued
function $\sqrt{\sigma}$ to the complex double-valued function $\sqrt{s}$. We go onto only one
sheet of the Riemann surface, the choice being the one on which the points of $\sqrt{\sigma}$
are situated [i.e., the right half of the whole s-plane, expressed by $\mathrm{Re}(s) > 0$].
With this convention, we arrive at the result

$$F(s) = L\left[\frac{1}{\sqrt{x}}\right] = \frac{\sqrt{\pi}}{\sqrt{s}}, \tag{13.15}$$

where the symbol $\sqrt{s}$ implies the single-valued branch mentioned above.

Fig. 13.3. The double-valuedness of the function $\sqrt{s}$

Remark. If the above case had been treated throughout with the variable s
retained, the formal variable change would have led to the factor $1/\sqrt{s}$ as in
(13.15). However, we would not then have a clear meaning for $\sqrt{s}$; i.e., there
would be no way to determine which branch is to be taken.
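The choice of branch can be tested against a direct numerical evaluation of the Laplace integral at complex s. The sketch below assumes the substitution $x = u^2$, which removes the singularity at the origin and gives $2\int_0^\infty e^{-su^2}\,du$, evaluated by the trapezoidal rule on a truncated interval; the function names, the truncation point, and the sample values of s are assumptions. It relies on the fact that Python's `cmath.sqrt` returns the root with non-negative real part, which is the branch that is positive on the positive real axis.

```python
import cmath
import math

def laplace_inv_sqrt_numeric(s, upper=12.0, n=24000):
    """int_0^inf e^{-s x} x^{-1/2} dx for Re(s) > 0, computed as 2*int_0^upper e^{-s u^2} du."""
    h = upper / n
    total = 0.5 * (1.0 + cmath.exp(-s * upper ** 2))
    for j in range(1, n):
        u = j * h
        total += cmath.exp(-s * u * u)
    return 2.0 * h * total

def principal_branch(s):
    """sqrt(pi)/sqrt(s), using the root of s that is positive for positive real s."""
    return math.sqrt(math.pi) / cmath.sqrt(s)

for s in (1.0 + 2.0j, 0.5 - 1.0j):
    print(s, laplace_inv_sqrt_numeric(s), principal_branch(s), -principal_branch(s))

print(cmath.sqrt(1j), cmath.exp(1j * math.pi / 4.0))
```

The numerical integral matches the principal value and not its negative, so the continuation from the positive real axis indeed stays on the sheet described above.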


Exercises 


1. Show that

$$\lim_{x\to +0} f(x) = \lim_{s\to\infty} sF(s)$$

and

$$\lim_{x\to\infty} f(x) = \lim_{s\to 0} sF(s),$$

where the Laplace integral L[f(x)] = F(s) converges for $\sigma > 0$.

Solution: Take the limit $s \to \infty$ on both sides of the equation

$$\int_0^{\infty} f'(x) e^{-sx}\,dx = sF(s) - f(0). \tag{13.16}$$

Then we have $0 = \lim_{s\to\infty} sF(s) - f(0+)$, which gives us our first
result. Moreover, in the limit $s \to 0$, the left-hand side of (13.16) reads
$\int_0^{\infty} f'(x)\,dx = \lim_{x\to\infty} f(x) - f(0+)$, so that we get our second result. ♣
2. Find the transform of the function

$$f(x) = \sqrt{x^{k}},$$

where $k \ge 1$ and is an odd integer.

Solution: This function gives convergence for $\sigma > 0$. Integration by
parts yields a general recurrence equation:

$$\int_0^{\infty} \sqrt{x^{k}}\, e^{-\sigma x}\,dx = \left[-\frac{\sqrt{x^{k}}\, e^{-\sigma x}}{\sigma}\right]_0^{\infty} + \frac{k}{2\sigma} \int_0^{\infty} \sqrt{x^{k-2}}\, e^{-\sigma x}\,dx.$$

Since $k \ge 1$, the lower limit can be used in the first term on the
right-hand side (and thus the integral exists). The result can be stated as

$$L\left[\sqrt{x^{k}}\right] = \frac{k}{2s}\, L\left[\sqrt{x^{k-2}}\right],$$

where $k \ge 1$ and odd.

This yields a sequence of equations, starting with $\sqrt{x^{-1}}$, whose transform is
obtained from (13.15). Consequently, we have

$$L\left[\sqrt{x}\right] = \frac{\sqrt{\pi}}{2\sqrt{s^{3}}}, \qquad L\left[\sqrt{x^{3}}\right] = \frac{3\sqrt{\pi}}{4\sqrt{s^{5}}}, \qquad \cdots, \qquad L\left[\sqrt{x^{k}}\right] = \frac{(k+1)!\,\sqrt{\pi}}{2^{k+1}\,[(k+1)/2]!\,\sqrt{s^{k+2}}}.$$

In these general equations, the root of a power of s is always interpreted
as being on the sheet of the Riemann surface on which the values of
$\sqrt{\sigma^{k+2}}$ are found. ♣
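For real positive s the general formula of this exercise can be checked against two independent expressions: the recurrence started from $L[x^{-1/2}] = \sqrt{\pi/s}$, and the standard gamma-function form $\Gamma(k/2+1)/s^{k/2+1}$ of the transform of a power. The latter is a well-known identity that is not derived in the text, and the function names below are hypothetical.

```python
import math

def closed_form(k, s):
    """(k+1)! sqrt(pi) / (2^(k+1) [(k+1)/2]! s^((k+2)/2)) for odd k >= 1, real s > 0."""
    half = (k + 1) // 2
    return (math.factorial(k + 1) * math.sqrt(math.pi)
            / (2 ** (k + 1) * math.factorial(half) * s ** ((k + 2) / 2)))

def by_recurrence(k, s):
    """Apply L[sqrt(x^j)] = (j/(2s)) L[sqrt(x^(j-2))] for j = 1, 3, ..., k."""
    value = math.sqrt(math.pi / s)      # L[x^(-1/2)], the starting point
    for j in range(1, k + 1, 2):
        value *= j / (2.0 * s)
    return value

def gamma_form(k, s):
    """Gamma(k/2 + 1) / s^(k/2 + 1), the standard transform of x^(k/2)."""
    return math.gamma(k / 2 + 1) / s ** (k / 2 + 1)

s = 2.0
for k in (1, 3, 5, 7):
    print(k, closed_form(k, s), by_recurrence(k, s), gamma_form(k, s))
```

The three columns coincide to rounding error for every odd k tried, which supports the factorial form of the coefficient.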


13.3 Convergence Theorems for Laplace Integrals 

13.3.1 Functions of Exponential Order 

The Laplace integral is improper by virtue of an infinite limit of integration,
as shown clearly by

$$\int_0^{\infty} f(x) e^{-sx}\,dx = \lim_{R\to\infty} \int_0^{R} f(x) e^{-sx}\,dx. \tag{13.17}$$

This improper integral can be identified with the Laplace transform F(s) only
when it converges for the values of s in question. Therefore, it is important to
clarify the conditions under which the Laplace integral converges. As a first
step in addressing this issue, we introduce a new class of functions:

♠ Functions of exponential order:

A function f(x) is said to be of exponential order $\alpha_0$ if there is a real
number $\alpha_0$ such that

$$\lim_{x\to\infty} f(x) e^{-\alpha x} = 0 \quad \text{for any } \alpha > \alpha_0, \tag{13.18}$$

and with the limit not existing when $\alpha < \alpha_0$.

See Fig. 13.4 for the decaying behavior of a function f(x) of exponential order
$\alpha_0$. Note that condition (13.18) is not necessarily satisfied at $\alpha = \alpha_0$. The order
number $\alpha_0$ may take $-\infty$ if f(x) is identically zero beyond some finite value
of x.

Fig. 13.4. Decaying behavior of a function f(x) of exponential order $\alpha_0$

Examples 1. The function $f(x) = x^{3}$ is of exponential order zero. To see
this, it suffices to check whether or not

$$\lim_{x\to\infty} \left(e^{-\alpha x} x^{3}\right) \tag{13.19}$$

exists. If $\alpha > 0$, then l'Hôpital's rule gives

$$\lim_{x\to\infty} \frac{x^{3}}{e^{\alpha x}} = \lim_{x\to\infty} \frac{6}{\alpha^{3} e^{\alpha x}} = 0.$$

In contrast, when $\alpha < 0$, (13.19) obviously diverges. Therefore $x^{3}$ is of
exponential order zero. In a similar manner, it can be shown that $x^{n}$ for
any integer n > 0 is of exponential order zero.

2. The function $f(x) = e^{cx}$ with any real constant c is of exponential order
c, owing to the fact that

$$\lim_{x\to\infty} e^{cx} e^{-\alpha x} = 0$$

if and only if $\alpha > c$.

13.3.2 Convergence for Exponential-Order Cases 

Suppose f(x) to be of exponential order $\alpha_0$. Then, we can show that the
Laplace integral

$$\int_0^{\infty} f(x) e^{-sx}\,dx \tag{13.20}$$

converges absolutely whenever the real component of s is located within the
range

$$\mathrm{Re}(s) = \sigma > \alpha_0. \tag{13.21}$$

Since absolute convergence implies ordinary convergence, the inequality (13.21)
serves as a sufficient condition for the Laplace integral (13.20) to converge.
This result is formally stated by the theorem below.

♠ Theorem (a sufficient condition for convergence for exponential-order cases):

If f(x) is of exponential order $\alpha_0$, then the Laplace integral
$\int_0^{\infty} f(x) e^{-sx}\,dx$ converges for

$$\mathrm{Re}(s) > \alpha_0.$$

(See also Fig. 13.5.)

Proof For any $\sigma$ in the range of (13.21), we can pick a number $\alpha_1$ such that

$$\alpha_0 < \alpha_1 < \sigma.$$

Since f(x) is of exponential order $\alpha_0$, we have

$$\lim_{x\to\infty} f(x) e^{-\alpha_1 x} = 0.$$

This implies that for any given small $\varepsilon > 0$, we can find an appropriate X
such that

$$|f(x)|\, e^{-\alpha_1 x} < \varepsilon \quad \text{for any } x > X.$$

Hence, given any small $\varepsilon > 0$, there exists an X such that for $A, A' > X$,

$$\int_A^{A'} |f(x)|\, e^{-\sigma x}\,dx = \int_A^{A'} |f(x)|\, e^{-\alpha_1 x} e^{-(\sigma-\alpha_1)x}\,dx < \varepsilon \int_A^{A'} e^{-(\sigma-\alpha_1)x}\,dx, \tag{13.22}$$

where the last integral in (13.22) converges to a finite value because $\sigma > \alpha_1$.
This means that the leftmost integral in (13.22) can be made to approach
zero by taking X sufficiently large. Thus in view of Cauchy's test for
improper integrals given in Sect. 3.4.4, the inequality (13.22) establishes
the absolute (and thus ordinary) convergence of the integral (13.20) in the
region $\mathrm{Re}(s) > \alpha_0$. ♣

Fig. 13.5. Converging region of the Laplace integral of a function of exponential
order $\alpha_0$ (axes: Re s, Im s)

Remark. The above theorem provides a sufficient condition for the ordinary
convergence of the Laplace integral. Hence, a given Laplace integral
of a function of exponential order $\alpha_0$ must converge for $\mathrm{Re}(s) > \alpha_0$,
whereas it may or may not converge at $\mathrm{Re}(s) < \alpha_0$. For example, $f(x) = \cos e^{x}$
gives $\alpha_0 = 0$, but the corresponding Laplace integral converges for
$\mathrm{Re}(s) > -1$.


13.3.3 Uniform Convergence for Exponential-Order Cases 

Next we examine the condition for uniform convergence. Here, uniform
convergence means that the improper integral (13.20) as a function of $s$
converges uniformly to $F(s)$ over the whole defining region of the $s$-plane.
To proceed, let $\alpha_2$ be a number greater than $\alpha_0$ and let
$\sigma$ be in the range

$$\alpha_0 < \alpha_2 \le \sigma. \tag{13.23}$$

For any choice of $\alpha_2$, we can find a number $\alpha_1$ such that

$$\alpha_0 < \alpha_1 < \alpha_2.$$

The relation (13.22) is again valid by use of $\alpha_2$ instead of
$\alpha_1$, as expressed by

$$\int_A^{A'} |f(x)| e^{-\sigma x}\,dx
  < \varepsilon \int_A^{A'} e^{-(\sigma-\alpha_2)x}\,dx.$$

Furthermore, by introducing $\alpha_1$, we can extend this inequality to

$$\int_A^{A'} |f(x)| e^{-\sigma x}\,dx
  < \varepsilon \int_A^{A'} e^{-(\alpha_2-\alpha_1)x}\,dx,$$

since $|f(x)| e^{-\alpha_1 x} < \varepsilon$ and
$e^{-(\sigma-\alpha_1)x} \le e^{-(\alpha_2-\alpha_1)x}$ for
$\sigma \ge \alpha_2$. Note that the last integral converges and is
independent of $\sigma$. Therefore, in view of the Weierstrass test for
improper integrals (see Sect. 3.4.4), the Laplace integral
$\int_0^\infty f(x)e^{-sx}\,dx$ converges uniformly for
$\mathrm{Re}(s) \ge \alpha_2 > \alpha_0$.
We have thus proved the following theorem:


Theorem (= A sufficient condition for uniform convergence for
exponential-order cases):

If $f(x)$ is of exponential order $\alpha_0$, then the Laplace integral
$\int_0^\infty f(x)e^{-sx}\,dx$ converges uniformly to $F(s) = L[f(x)]$ for

$$\mathrm{Re}(s) \ge \alpha_2 > \alpha_0.$$

(See also Fig. 13.6.)


Here, the constant $\alpha_2$ emphasizes that the converging region guaranteed
by this theorem is closed at the lower end.


Fig. 13.6. The region of uniform convergence associated with a function of
exponential order $\alpha_0$ (axes: Re s, Im s; abscissas $0$, $\alpha_0$,
$\alpha_2$ marked)

Remark. It is important to remember that the above theorem gives only a
sufficient condition for convergence of the Laplace integral. In fact, it is
possible that some functions of exponential order allow their Laplace integrals
to converge uniformly to the left of $\alpha_0$.


13.3.4 Convergence for General Cases 

The previous two theorems tell a great deal about convergence of the Laplace 
integral for practical functions. On the other hand, for functions that are not 
of exponential order (but continuous within the integration interval), the 
following slightly different theorem applies. 

Theorem (= A sufficient condition for convergence for general cases):

If the improper integral

$$\int_0^\infty f(x)e^{-sx}\,dx$$

converges for $s = s_0$, then it converges for
$\mathrm{Re}(s) > \mathrm{Re}(s_0)$ (see also Fig. 13.7).



Fig. 13.7. Converging region of the Laplace integral that converges for
$s = s_0$ (axes: Re s, Im s)

Proof. The proof requires an auxiliary function

$$g(x) = \int_x^\infty f(t)e^{-s_0 t}\,dt, \tag{13.24}$$

where $f(x)$ is assumed to satisfy the conditions given above. Since $f(x)$ is
continuous, $g(x)$ is also continuous and thus its derivative is given by

$$g'(x) = -f(x)e^{-s_0 x}.$$


In terms of $g(x)$, the Laplace integral can be written as

$$\int_0^\infty f(x)e^{-sx}\,dx
  = \int_0^\infty f(x)e^{-s_0 x}e^{-wx}\,dx
  = -\int_0^\infty g'(x)e^{-wx}\,dx, \tag{13.25}$$

where we have set $w = s - s_0$.

We now examine sufficient conditions for the rightmost integral in (13.25)
to converge. Cauchy's test for improper integrals given in Sect. 3.4.4
says that it converges if and only if for an arbitrary small
$\varepsilon > 0$, we can find an $X$ that yields

$$\left| \int_{A'}^{A} g'(x)e^{-wx}\,dx \right| < \varepsilon \tag{13.26}$$

with $A', A > X$. Therefore, our task is to show that the relation (13.26)
holds for $\mathrm{Re}(s) > \mathrm{Re}(s_0)$.

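Integration by parts gives us

$$\int_{A'}^{A} g'(x)e^{-wx}\,dx
  = -g(A')e^{-wA'} + g(A)e^{-wA} + w\int_{A'}^{A} g(x)e^{-wx}\,dx, \tag{13.27}$$

which results in

$$\left| \int_{A'}^{A} g'(x)e^{-wx}\,dx \right|
  \le |g(A')|\,|e^{-wA'}| + |g(A)|\,|e^{-wA}|
  + |w| \int_{A'}^{A} |g(x)e^{-wx}|\,dx. \tag{13.28}$$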

From (13.24) and from the hypothesis given in the theorem, it follows that
for an arbitrary small $\varepsilon' > 0$, there exists a number $X$ such that

$$|g(x)| < \varepsilon' \quad \text{when } x > X.$$

Thus, if $A', A > X$ we have

$$|g(A')|,\ |g(A)| < \varepsilon'.$$

In addition, if

$$u = \mathrm{Re}(w) > 0,$$

then the relation (13.28) becomes

$$\left| \int_{A'}^{A} g'(x)e^{-wx}\,dx \right|
  < \varepsilon' \left[ e^{-uA'} + e^{-uA}
  + \frac{|w|}{u}\left( e^{-uA'} - e^{-uA} \right) \right]
  < \varepsilon' \left( 2 + \frac{|w|}{u} \right). \tag{13.29}$$

Observe that the quantity in parentheses in (13.29) is finite for any fixed
value of $w$ with $u > 0$. Therefore, by making $\varepsilon'$ small enough,
the quantity

$$\varepsilon = \varepsilon' \left( 2 + \frac{|w|}{u} \right) \tag{13.30}$$

becomes arbitrarily small; this can be the $\varepsilon$ in the relation
(13.26). Consequently, the relation (13.26) holds for any $u > 0$, or
equivalently, for any

$$u = \mathrm{Re}(w) = \mathrm{Re}(s) - \mathrm{Re}(s_0) > 0.$$

This completes the proof of the theorem. (Note that if $u = 0$, the quantity
in parentheses in (13.29) diverges, and if $u < 0$, the inequality (13.29)
itself does not hold.)


Remark. The theorem is inconclusive regarding convergence on the line
$\mathrm{Re}(s) = \mathrm{Re}(s_0)$ of the complex $s$-plane. Writing
$\sigma_0 = \mathrm{Re}(s_0)$, note that convergence is not guaranteed when
$\mathrm{Re}(s) = \sigma_0$. This means that even though the integral
converges at a point on the line $\mathrm{Re}(s) = \sigma_0$, it does not
necessarily converge all along the same line. A simple example is given by

$$f(x) = \begin{cases} 0, & 0 \le x < 1, \\[2pt] \dfrac{1}{x}, & x \ge 1. \end{cases}$$

The Laplace integral

$$\int_0^\infty f(x)e^{-s_0 x}\,dx
  = \int_1^\infty \frac{e^{-i\omega_0 x}}{x}\,dx
  = \int_1^\infty \frac{\cos \omega_0 x}{x}\,dx
  - i\int_1^\infty \frac{\sin \omega_0 x}{x}\,dx$$

converges for $s_0 = 0 + i\omega_0$ with $\omega_0 \ne 0$, but diverges at
$s_0 = 0$.
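The behaviour described in this remark can be checked numerically. The following is only a rough sketch, not part of the original argument: the helper name `partial_integral`, the trapezoidal rule, and the step size are illustrative assumptions. It compares truncated integrals of $\cos(\omega_0 x)/x$ over $[1, T]$ for $\omega_0 = 1$ (which settle down as $T$ grows) with the case $\omega_0 = 0$ (which grows like $\ln T$).

```python
import math

def partial_integral(omega, upper, steps_per_unit=200):
    """Trapezoidal estimate of the integral of cos(omega*x)/x over [1, upper]."""
    n = int((upper - 1.0) * steps_per_unit)
    h = (upper - 1.0) / n
    total = 0.5 * (math.cos(omega) + math.cos(omega * upper) / upper)
    for k in range(1, n):
        x = 1.0 + k * h
        total += math.cos(omega * x) / x
    return total * h

# omega = 1: the truncated integrals stabilise; omega = 0: they follow ln(T).
for upper in (10.0, 100.0, 1000.0):
    print(upper, partial_integral(1.0, upper), partial_integral(0.0, upper))
```

Only the real part is examined here; the imaginary part, built from $\sin(\omega_0 x)/x$, behaves in the same way for $\omega_0 \ne 0$.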


13.3.5 Uniform Convergence for General Cases 

A sufficient condition for uniform convergence is obtained in a similar way
as in Sect. 13.3.4, although it is not the same as that for ordinary
convergence. The difference is due to the fact that in the proof above,
$\varepsilon$ defined by (13.30) is dependent on $s$ through
$|w| = |s - s_0|$. In order to get the range of uniform convergence, we need
a certain infinitesimal factor that can be taken independently of $s$.

To derive such a factor, let $\theta$ be the angle of $w = s - s_0$, and
observe that $u = \mathrm{Re}(w)$ satisfies the relation

$$\frac{|w|}{u} = \frac{1}{\cos\theta}$$

when $u > 0$. If $\theta$ is restricted to the range

$$|\theta| < \frac{\pi}{2},$$

we can find an angle $\theta'$ that satisfies

$$|\theta| \le \theta' < \frac{\pi}{2}, \tag{13.31}$$

or equivalently,

$$\frac{|w|}{u} = \frac{1}{\cos\theta} \le \frac{1}{\cos\theta'}.$$

Inserting this into (13.30), we have

$$\varepsilon = \varepsilon'\left( 2 + \frac{|w|}{u} \right)
  \le \varepsilon'\left( 2 + \frac{1}{\cos\theta'} \right)
  = \varepsilon'',$$

where the quantity $\varepsilon''$ is independent of $s$ and becomes
arbitrarily small by making $\varepsilon'$ small enough. This is true as long
as condition (13.31) is satisfied; in this context, (13.31) represents the
region of uniform convergence of the Laplace integral. Rewriting $\theta$ as
$\arg(s - s_0)$, we arrive at the following theorem:


Theorem (= A sufficient condition for uniform convergence of the
Laplace integrals for general cases):

If the improper integral

$$\int_0^\infty f(x)e^{-sx}\,dx$$

converges for $s = s_0$, then it converges uniformly to $F(s) = L[f(x)]$ for

$$|\arg(s - s_0)| \le \theta' < \frac{\pi}{2}.$$

(See also Fig. 13.8.)

Here, $\theta'$ shows the closedness of the converging region. The angle
$\theta'$ can be arbitrarily close to, but not equal to, $\pi/2$.

Fig. 13.8. The region of uniform convergence for the Laplace integral that converges
for $s = s_0$


13.3.6 Distinction Between Exponential-Order Cases 
and General Cases 

We have thus far presented four convergence theorems in connection with
Laplace integrals, where the former two are associated with functions of
exponential order and the latter two are relevant to more general functions.
The theorems for the two cases are similar to the extent that they all identify
a half-plane of convergence for the Laplace integral. Moreover, the general
cases that we have considered cover a wide class of functions that includes
exponential-order functions as a special case. At first glance, these remarks
appear to imply that each of the former two theorems for exponential-order
cases is a special case of the corresponding theorem for general cases, but
this is not true. Below we give the reasons.

First, the theorem for ordinary convergence in the exponential-order case
is intrinsically different from that in the general case. Observe that the
former theorem not only tells us that the Laplace integral converges in a
half-plane; it also gives a specific number (i.e., $\alpha_0$) for the abscissa
of a left-hand boundary of such a half-plane. (Of course, the true abscissa of
convergence $\sigma_c$ satisfies $\sigma_c \le \alpha_0$, since the theorem
gives only a sufficient condition for convergence.) In contrast, the latter
theorem merely states convergence to the right of any point at which we
already know that the integral converges; it gives no information about a
boundary of the region of convergence.

Second, the regions of uniform convergence are specified in a different 
manner for the two cases. Whereas the theorem for general cases tells us 
only that the Laplace integral converges uniformly in an angular sector of the 
right half-plane, the theorem for exponential-order cases indicates uniform 
convergence in a less restricted region, namely, a half-plane. 

In short, the theorems for the two cases are essentially different. It should
also be emphasized again that all four theorems provide sufficient conditions
for convergence of the Laplace integrals, not necessary or
necessary-and-sufficient conditions.

13.3.7 Analytic Property of Laplace Transforms 

An important consequence of uniform convergence of the Laplace integral is 
the fact that the corresponding Laplace transform, 

$$F(s) = \int_0^\infty f(x)e^{-sx}\,dx, \tag{13.32}$$

is an analytic function on the complex s-plane. We know that if F(s) is 
analytic, it will exist outside the range of convergence of its integral represen- 
tation, which can be uniquely determined by analytic continuation. From a 
practical viewpoint, the analyticity of F(s) plays a crucial role in evaluating 
the Laplace transform of a given function, since we can use it to treat the 
complex variable s as if it were real (see Sect. 13.1.3). We close this section 
by proving the analyticity of F(s). 

Theorem:

The Laplace transform $F(s)$ is analytic in the region of uniform convergence
of the corresponding Laplace integral (13.32).


Proof. We first recall that for $F(s)$, there is a region of uniform
convergence in the $s$-plane, and then we perform a contour integration with
respect to $s$ over an arbitrary simple closed path $C$ in this region. Owing
to the uniform convergence property, the order of integration may be inverted
so that we have

$$\oint_C F(s)\,ds
  = \int_0^\infty f(x) \left( \oint_C e^{-sx}\,ds \right) dx = 0,$$

which gives us zero because Cauchy's integral theorem means that

$$\oint_C e^{-sx}\,ds = 0.$$

Since the path $C$ is arbitrary in the region of uniform convergence,
Morera's theorem establishes that $F(s)$ is analytic inside the region of
uniform convergence of its corresponding Laplace integral.


13.4 Inverse Laplace Transform 

13.4.1 The Two-Sided Laplace Transform 

This section describes the inverse Laplace transformation. Intuitively
understood, the inverse Laplace transform $L^{-1}[F(s)]$ of a function $F(s)$
is a function $f(x)$ whose Laplace transform is $F(s)$. Nevertheless, the
actual operations represented by the operator $L^{-1}$ take some time to
develop. To arrive at the explicit formula for the inverse transformation, we
first introduce another kind of Laplace transform:


Two-sided Laplace transform:

If the improper integral

$$\int_{-\infty}^{\infty} f(x)e^{-sx}\,dx \tag{13.33}$$

exists, it is called the two-sided Laplace transform (or bilateral
Laplace transform), designated by $\mathcal{F}(s) = \mathcal{L}[f(x)]$.

It is easy to determine the region of convergence of such an integral. Observe
that

$$\int_{-\infty}^{\infty} f(x)e^{-sx}\,dx
  = \int_{-\infty}^{0} f(x)e^{-sx}\,dx + \int_0^\infty f(x)e^{-sx}\,dx. \tag{13.34}$$

The second integral is an ordinary Laplace integral, so it converges on a
half-plane to the right of a fixed abscissa, denoted by $\sigma_{c1}$. By the
change of variable $x = -u$ the first integral becomes

$$\int_{-\infty}^{0} f(x)e^{-sx}\,dx = \int_0^\infty f(-u)e^{su}\,du.$$

Here, the latter integral is also an ordinary Laplace integral, although $s$
has been replaced by $-s$. Hence, its region of convergence is a half-plane to
the left of a point, say $\sigma_{c2}$. As a result, the common part of the two
half-planes, $\sigma_{c1} < \mathrm{Re}(s) < \sigma_{c2}$, forms the region of
convergence of the integral (13.34) as depicted in Fig. 13.9.

Remark. Typically, the range of convergence of (13.34) forms a vertical strip
of finite width, but it may be a right half-plane, a left half-plane, the
whole $s$-plane, a single point, or fail to exist.


Fig. 13.9. Overlapping region: the region of convergence of the two-sided Laplace
integral (axes: Re s, Im s; abscissas $\sigma_{c1}$, $0$, $\sigma_{c2}$ marked)

Example. We show that the function $1/(s^2 + s)$ can be expressed as a
two-sided Laplace integral. We readily see that

$$\frac{1}{s(s+1)} = \frac{1}{s} - \frac{1}{s+1}.$$

We know that

$$\frac{1}{s+1} = \int_0^\infty e^{-sx}e^{-x}\,dx \quad \text{for } \sigma > -1 \tag{13.35}$$

and

$$\frac{1}{s} = \int_0^\infty e^{-sx}\,dx \quad \text{for } \sigma > 0,$$

and that the latter can be rewritten as

$$\frac{1}{s} = \int_{-\infty}^{0} e^{-sx}(-1)\,dx \quad \text{for } \sigma < 0. \tag{13.36}$$

From (13.35) and (13.36), we obtain

$$\frac{1}{s(s+1)} = \frac{1}{s} - \frac{1}{s+1}
  = \int_{-\infty}^{\infty} f(x)e^{-sx}\,dx,$$

where

$$f(x) = \begin{cases} -e^{-x}, & 0 < x < \infty, \\ -1, & -\infty < x < 0, \end{cases}$$

which means that $1/(s^2 + s) = \mathcal{L}[f(x)]$. The interval of
convergence is seen to be $-1 < \sigma < 0$.
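As a sanity check on this example, the two-sided integral can be evaluated numerically at a real point inside the strip $-1 < \sigma < 0$. This is a rough sketch only: the function names, the truncation half-width, and the midpoint rule are illustrative assumptions, and only real values of $s$ are handled.

```python
import math

def f(x):
    """The function found in the example: -exp(-x) for x > 0 and -1 for x < 0."""
    return -math.exp(-x) if x > 0 else -1.0

def two_sided_laplace(func, s, half_width=60.0, n=240000):
    """Midpoint-rule estimate of the integral of func(x)*exp(-s*x)
    over [-half_width, half_width], for a real value of s."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += func(x) * math.exp(-s * x)
    return total * h

s = -0.5  # a point inside the strip of convergence
print(two_sided_laplace(f, s), 1.0 / (s * (s + 1.0)))
```

Choosing $s$ outside the strip, for instance $s = 0.5$, makes the truncated integral grow with the half-width instead of settling, in line with the stated interval of convergence.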


13.4.2 Inverse of the Two-Sided Laplace Transform 

Having introduced the two-sided Laplace transform, we are ready to undertake
the inverse Laplace transformation. We first observe that the two-sided Laplace
transform

$$\mathcal{F}(s) = \int_{-\infty}^{\infty} f(x)e^{-sx}\,dx \tag{13.37}$$

is identical with the Fourier transform

$$\int_{-\infty}^{\infty} f(x)e^{-\sigma x}e^{-i\omega x}\,dx$$

if we regard the real number $\sigma$ as fixed. We use the inverse Fourier
transformation to yield

$$f(x)e^{-\sigma x}
  = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathcal{F}(\sigma + i\omega)e^{i\omega x}\,d\omega,$$

or equivalently,

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}
  \mathcal{F}(\sigma + i\omega)e^{\sigma x}e^{i\omega x}\,d\omega. \tag{13.38}$$

We then replace $\sigma + i\omega$ by $s$, keeping in mind that $s$ should lie
on the vertical line with the abscissa $\mathrm{Re}(s) = \sigma$. Then the
integral (13.38) can be regarded as a contour integral along the vertical line
$\mathrm{Re}(s) = \sigma$. On this contour,

$$ds = i\,d\omega,$$

so the integral (13.38) becomes

$$f(x) = \frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty}
  \mathcal{F}(s)e^{sx}\,ds \quad (\mathrm{Re}(s) = \sigma \text{ is fixed}). \tag{13.39}$$

This result provides a clue for evaluating the explicit form of $f(x)$ from its
two-sided Laplace transform $\mathcal{F}(s)$.

The result (13.39) is not yet satisfying. We should recall that $f(x)$ is
not determined uniquely by $\mathcal{F}(s)$ through (13.39) unless the
abscissa $\sigma$ of the integration line is specified (see Exercise 3 in
Sect. 13.4). If we know in advance that $\sigma$ lies in the region of
convergence of the two-sided integral given by (13.37), i.e., the strip of
convergence, $f(x)$ is uniquely determined by (13.39). However, if $\sigma$
used in (13.39) is set outside this strip, the value of the integral in
(13.39) changes because the integration contour passes over one or more
singular points of $\mathcal{F}(s)$. Thus, for us to be able to use equation
(13.39), we must know the region of convergence of the Laplace integral of
$f(x)$ before we can fix the real number $\sigma$. If only $\mathcal{F}(s)$ is
given, we will not be able to locate this region, and not be able to obtain
$f(x)$ because we will not know where to put $\sigma$. These caveats lead to
the following theorem:


Theorem:

The inverse of the two-sided Laplace transform

$$f(x) = \frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty}
  \mathcal{F}(s)e^{sx}\,ds \quad (\mathrm{Re}(s) = \sigma \text{ is fixed})$$

determines $f(x)$ uniquely only if we know where $\sigma$ should be located.


13.4.3 Inverse of the One-Sided Laplace Transform 

Let us develop the theory that corresponds to the above for the one-sided
transform. We compare the two-sided transform $\mathcal{L}[f(x)]$ and its
one-sided counterpart $L[f(x)]$, where $f(x)$ is the same function in both
cases and is defined for all $x$. From the definitions of the one- and
two-sided transforms, it is evident that

$$F(s) = L[f(x)] = \mathcal{L}[f(x)\theta(x)],$$

where $\theta(x)$ is the step function. This implies that

$$f(x)\theta(x) = \frac{1}{2\pi i}\int_{\sigma - i\infty}^{\sigma + i\infty}
  F(s)e^{sx}\,ds \quad (\sigma \text{ is fixed}). \tag{13.40}$$

Here, $\sigma$ must be to the right of all the singularities of $F(s)$ in
order for the integral in (13.40) to converge. As a consequence, we have
arrived at the following theorem:


The inverse Laplace transformation:

If the function $F(s)$ defined by

$$F(s) = \int_0^\infty e^{-sx}f(x)\,dx$$

is analytic for $\mathrm{Re}(s) > \sigma_c$, then $f(x)$ for $x > 0$ is
uniquely determined by

$$f(x) = \lim_{\omega \to \infty} \frac{1}{2\pi i}
  \int_{\sigma - i\omega}^{\sigma + i\omega} e^{sx}F(s)\,ds,$$

where $\sigma$ is arbitrary for all $\sigma > \sigma_c$.


13.4.4 Useful Formula for Inverse Laplace Transformation 

In contrast to the situation with the inverse Fourier transformation, the use of
the inverse Laplace transformation formula is less convenient. This is primarily
because the calculation of the complex integral

$$\int_{\sigma - i\infty}^{\sigma + i\infty} e^{sx}F(s)\,ds$$

can be rather complicated. In this subsection, we present a simple and natural
method of computing integrals of this form that is based on the residue
theorem.

Suppose that $F(s)$ is analytic on the domain $\mathrm{Re}(s) > \sigma_c$. We
wish to compute

$$f(x) = \lim_{\omega \to \infty} \frac{1}{2\pi i}
  \int_{\sigma - i\omega}^{\sigma + i\omega} e^{sx}F(s)\,ds, \quad x > 0.$$

No general method for doing this exists, but it is possible to evaluate this
integral under certain conditions on $F(s)$. Suppose that $F(s)$ is analytic
on the entire complex plane, except at a finite number of singularities
$s_1, s_2, \ldots, s_n$ satisfying

$$\mathrm{Re}(s_j) < \sigma_c, \quad j = 1, 2, \ldots, n.$$

Figure 13.10 is a sketch of this situation. Let $\sigma > \sigma_c$ and let
$R > 0$ be a real number sufficiently large that the left half-circle $C_R$
with center $s = \sigma$ and radius $R$ encloses all the points
$s_1, s_2, \ldots, s_n$. Divide $C_R$ into the two segments:

$$I_R = \{ s \in \mathbb{C} : s = \sigma + i\omega,\ -R \le \omega \le R \},$$
$$\Gamma_R = \{ s \in \mathbb{C} : |s - \sigma| = R,\ \mathrm{Re}(s) \le \sigma \}.$$

Fig. 13.10. A finite number of singularities $s_j$ of $F(s)$ enclosed by the
left half-circle $C_R$ composed of $\Gamma_R$ and $I_R$

By the residue theorem,

$$\oint_{C_R} e^{sx}F(s)\,ds
  = 2\pi i \sum_{j=1}^{n} \mathrm{Res}\left[ e^{sx}F(s);\ s_j \right].$$

The right-hand side is independent of $R$, if $R$ is sufficiently large. From
$C_R = \Gamma_R \cup I_R$, it follows that

$$\oint_{C_R} e^{sx}F(s)\,ds
  = \int_{\Gamma_R} e^{sx}F(s)\,ds + \int_{I_R} e^{sx}F(s)\,ds.$$

Clearly,

$$\lim_{M \to \infty} \int_{\sigma - iM}^{\sigma + iM} e^{sx}F(s)\,ds
  = \lim_{R \to \infty} \int_{I_R} e^{sx}F(s)\,ds.$$

Therefore if, by chance, we have

$$\lim_{R \to \infty} \int_{\Gamma_R} e^{sx}F(s)\,ds = 0, \tag{13.41}$$

then we obtain

$$f(x) = \lim_{M \to \infty} \frac{1}{2\pi i}
  \int_{\sigma - iM}^{\sigma + iM} e^{sx}F(s)\,ds
  = \sum_{j=1}^{n} \mathrm{Res}\left[ e^{sx}F(s);\ s_j \right].$$

Unfortunately, condition (13.41) does not hold for every $F$. The next theorem
presents a sufficient condition on $F$ under which (13.41) holds true.


Theorem:

Let $F$ be an analytic function on the complex plane except at a finite
number of points (if they exist) and let $\Gamma_R$ be as above. If

$$\lim_{R \to \infty} \max_{s \in \Gamma_R} |F(s)| = 0,$$

then

$$\lim_{R \to \infty} \int_{\Gamma_R} e^{sx}F(s)\,ds = 0$$

holds for every $x > 0$.


Proof. This theorem is a reinterpretation of Jordan's lemma given in
Sect. 9.2.4.


An immediate consequence of this theorem is the following: 






Theorem:

Let $F$ be an analytic function on the complex plane except at a finite
number of points $s_1, s_2, \ldots, s_n$, satisfying
$\mathrm{Re}(s_j) < \sigma$ for all $j$. If

$$\lim_{R \to \infty} \max_{s \in C_R} |F(s)| = 0, \tag{13.42}$$

then the inverse Laplace transform of $F(s)$ is given by

$$f(x) = \sum_{j=1}^{n} \mathrm{Res}\left[ e^{sx}F(s);\ s_j \right]. \tag{13.43}$$

13.4.5 Evaluating Inverse Transformations 

Below are several examples of actual evaluations of inverse Laplace transforms 
via the residue formula (13.43). 

Example 1. Assume a complex-valued function

$$F(s) = \frac{1}{s^2 - 3s + 2}$$

that has two simple poles $s_1 = 1$ and $s_2 = 2$. We thus choose
$\sigma = 3$ and set

$$C_R = \{ s : |s - 3| = R,\ \mathrm{Re}(s) \le 3 \}$$

in order to make use of equation (13.43). Before doing so, we must check that
condition (13.42) is satisfied. Observe that

$$\max_{s \in C_R} |F(s)| = \max_{s \in C_R} \frac{1}{|(s-1)(s-2)|}.$$

If we let $R = |s - 3|$ go to infinity, then $|s - 1|$ and $|s - 2|$ will also
go to infinity, so that

$$\lim_{R \to \infty} \max_{s \in C_R} \frac{1}{|(s-1)(s-2)|} = 0.$$

Thus (13.43) provides the desired result:

$$f(x) = \mathrm{Res}\left[ \frac{e^{sx}}{(s-1)(s-2)};\ s = 1 \right]
  + \mathrm{Res}\left[ \frac{e^{sx}}{(s-1)(s-2)};\ s = 2 \right]
  = \left. \frac{e^{sx}}{s-2} \right|_{s=1}
  + \left. \frac{e^{sx}}{s-1} \right|_{s=2}
  = \frac{e^{x}}{1-2} + \frac{e^{2x}}{2-1}
  = e^{2x} - e^{x}.$$

Remark. The above example can be solved more easily by rewriting $F$ using
partial fractions $F(s) = 1/(s-2) - 1/(s-1)$, followed by applying known
equations to get

$$f(x) = L^{-1}[F(s)]
  = L^{-1}\left[ \frac{1}{s-2} \right] - L^{-1}\left[ \frac{1}{s-1} \right]
  = e^{2x} - e^{x}.$$
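The residue formula (13.43) in Example 1 can also be checked numerically. The sketch below is an illustration under stated assumptions rather than the book's procedure: the helper `residue` approximates each residue by a discretized contour integral over a small circle around the pole (the radius and the number of sample points are arbitrary choices), and the result is compared with the closed form $e^{2x} - e^{x}$.

```python
import cmath
import math

def residue(g, pole, radius=0.25, n=256):
    """Approximate (1/(2*pi*i)) * contour integral of g over a small circle
    centred at `pole`, using n equally spaced points on the circle."""
    total = 0j
    for k in range(n):
        phase = cmath.exp(2j * math.pi * k / n)
        # ds = i*radius*phase*dtheta; the factor i cancels against 1/(2*pi*i).
        total += g(pole + radius * phase) * radius * phase
    return total / n

def F(s):
    """Transform from Example 1, with simple poles at s = 1 and s = 2."""
    return 1.0 / (s * s - 3.0 * s + 2.0)

def inverse_via_residues(x, poles=(1.0, 2.0)):
    """Sum of the residues of exp(s*x)*F(s) at the given poles, as in (13.43)."""
    return sum(residue(lambda s: cmath.exp(s * x) * F(s), p) for p in poles).real

x = 1.0
print(inverse_via_residues(x), math.exp(2.0 * x) - math.exp(x))
```

The radius must be smaller than the distance between neighbouring poles (here 1) so that each circle encloses exactly one pole.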



Example 2. It should be cautioned that equation (13.43) is valid only when
the condition (13.42) is satisfied. As a negative example, let us consider the
step function

$$\theta(x) = \begin{cases} 1, & x > c, \\ 0, & x < c, \end{cases}$$

with $c > 0$, whose Laplace transform reads

$$F(s) = L[\theta(x)] = \frac{e^{-cs}}{s} \quad (s > 0).$$

We would like to derive $\theta(x)$ from $F(s)$ through the inverse
transformation given by

$$f(x) = L^{-1}[F(s)] = \lim_{M \to \infty} \frac{1}{2\pi i}
  \int_{\sigma - iM}^{\sigma + iM} \frac{e^{sx}e^{-cs}}{s}\,ds
  \quad (0 < x \ne c).$$

However, we cannot use (13.43) to obtain $f(x)$, since the function
$e^{-cs}/s$ does not satisfy condition (13.42). In fact, if we set
$s = \sigma - R$, then

$$\max_{s \in C_R} \left| \frac{e^{-cs}}{s} \right|
  \ge \frac{e^{cR}e^{-c\sigma}}{|\sigma - R|} \to \infty \quad (R \to \infty)$$

since $c > 0$.


Remark. If we were to use (13.43) in Example 2, we would obtain a wrong
result. The function $e^{-cs}/s$ has a single simple pole at $s = 0$, so

$$\mathrm{Res}\left[ \frac{e^{sx}e^{-cs}}{s};\ s = 0 \right] = 1$$

for each value of $x$. This is, of course, not the step function $\theta(x)$.
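The failure of condition (13.42) in Example 2 is easy to see numerically. The short sketch below (the function name and the sample values $c = 1$, $\sigma = 1$ are illustrative assumptions) evaluates $|e^{-cs}/s|$ at the leftmost point $s = \sigma - R$ of the half-circle and shows that this lower bound on the maximum grows without limit.

```python
import math

def modulus_at_leftmost_point(R, c=1.0, sigma=1.0):
    """|exp(-c*s)/s| at s = sigma - R, the leftmost point of the half-circle
    (valid for R != sigma)."""
    s = sigma - R
    return math.exp(-c * s) / abs(s)

# The values increase rapidly with R, so the maximum over the arc cannot tend to 0.
for R in (5.0, 10.0, 20.0):
    print(R, modulus_at_leftmost_point(R))
```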


Example 3. Next we consider the inverse Laplace transformation
$L^{-1}[F(s)]$ of the function

$$F(s) = \frac{1}{s + a} \quad (a > 0).$$

The function $F(s)$ has a first-order pole at $s = -a$. The residue of
$F(s)e^{sx}$ at $s = -a$ reads

$$\mathrm{Res}[F(s)e^{sx},\ -a]
  = \lim_{s \to -a} (s + a)F(s)e^{sx}
  = \lim_{s \to -a} e^{sx} = e^{-ax}.$$

Hence, we have

$$f(x) = L^{-1}[F(s)] = e^{-ax} \quad (x > 0).$$


13.4.6 Inverse Transform of Multivalued Functions 

Some caution must be taken when considering the inverse Laplace transform of
multivalued functions. As an example, we consider a multivalued function
$F(s) = 1/\sqrt{s}$, and examine its inverse transform given by

$$f(x) = \frac{1}{2\pi i}\int_C \frac{e^{sx}}{\sqrt{s}}\,ds, \tag{13.44}$$

where the symbol $\sqrt{s}$ represents values of the original double-valued
function $s^{1/2}$ in the same sheet of the Riemann surface. The function
$1/\sqrt{s}$ has a branch point at $s = 0$, so among many choices we set its
branch cut at $(-\infty, 0]$.

Since the function $1/\sqrt{s}$ approaches zero as $|s| \to \infty$, Jordan's
lemma is applicable. Nevertheless, the problem becomes rather complicated owing
to the presence of the branch cut. To perform the integration of (13.44), we
close the path $C$ by a large semicircle $\Gamma$ to the left, bypassing the
branch cut in the manner shown in Fig. 13.11. No singularities are enclosed by
the closed curve consisting of $C' + \Gamma + \gamma + C''$, in which $C'$ is
the vertical line, $C''$ is the pair of parallel horizontal segments, $\gamma$
is the small circle of radius $\delta$, and $\Gamma$ is a semicircle from which
the infinitesimal gap at the branch cut has been omitted. Hence, we have

$$\int_{C'} \frac{e^{sx}}{\sqrt{s}}\,ds
  + \int_{\Gamma} \frac{e^{sx}}{\sqrt{s}}\,ds
  + \int_{\gamma} \frac{e^{sx}}{\sqrt{s}}\,ds
  + \int_{C''} \frac{e^{sx}}{\sqrt{s}}\,ds = 0.$$

In the limit $R \to \infty$, the integral over $\Gamma$ vanishes and the path
$C'$ reduces to $C$ as given in (13.44), which implies that

$$f(x) = \lim_{R \to \infty} \frac{1}{2\pi i}\int_{C'} \frac{e^{sx}}{\sqrt{s}}\,ds
  = \lim_{R \to \infty} \frac{-1}{2\pi i}\int_{C'' + \gamma} \frac{e^{sx}}{\sqrt{s}}\,ds. \tag{13.45}$$

Fig. 13.11. Closed loop employed in evaluating the integral (13.44)

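Thus our remaining task is to evaluate the last term in (13.45).

Recall that $\sqrt{s}$ is double-valued so that it is discontinuous across the
branch cut. Consequently, on the parallel segments $C''$,

$$\sqrt{s} = i\sqrt{\rho} \ \text{ and } \ -i\sqrt{\rho} \quad \text{with } \rho = |s|,$$

above and below the branch cut, respectively. On the small circle $\gamma$,

$$\sqrt{s} = \sqrt{\delta}\,e^{i\phi/2},$$

where $-\pi < \phi < \pi$, and $ds = -d\rho$ on each of the straight lines.
Thus, we have

$$\int_{C'' + \gamma} \frac{e^{sx}}{\sqrt{s}}\,ds
  = -i\int_{R-\sigma}^{\delta} \frac{e^{-\rho x}}{\sqrt{\rho}}\,(-d\rho)
  + i\int_{\delta}^{R-\sigma} \frac{e^{-\rho x}}{\sqrt{\rho}}\,(-d\rho)
  + \int_{\pi}^{-\pi} \frac{e^{(\delta\cos\phi)x}e^{i(\delta\sin\phi)x}}
    {\sqrt{\delta}\,e^{i\phi/2}}\, i\delta e^{i\phi}\,d\phi.$$

Let $\delta$ go to zero and $R$ approach infinity; then, the first two
integrals on the right-hand side combine into a single integral. The last
integral on the right-hand side approaches zero. As a result, we have

$$\lim_{\delta \to 0}\lim_{R \to \infty}
  \int_{C'' + \gamma} \frac{e^{sx}}{\sqrt{s}}\,ds
  = -2i\int_0^\infty \frac{e^{-\rho x}}{\sqrt{\rho}}\,d\rho,$$

which implies that

$$f(x) = \frac{1}{\pi}\int_0^\infty \frac{e^{-\rho x}}{\sqrt{\rho}}\,d\rho.$$

By substituting $\rho x = u^2$, the right-hand side becomes

$$\frac{1}{\pi}\int_0^\infty \frac{e^{-\rho x}}{\sqrt{\rho}}\,d\rho
  = \frac{2}{\pi\sqrt{x}}\int_0^\infty e^{-u^2}\,du.$$

Eventually, we obtain

$$f(x) = L^{-1}\left[ \frac{1}{\sqrt{s}} \right] = \frac{1}{\sqrt{\pi x}},$$

which is consistent with the earlier result presented in (13.15).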
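The result can be cross-checked in the forward direction: the Laplace transform of $1/\sqrt{\pi x}$ should return $1/\sqrt{s}$. The sketch below is an illustrative check for real $s > 0$ only; the function name, the cutoff, and the midpoint rule are assumptions. It uses the substitution $x = u^2$, which removes the integrable singularity at $x = 0$ and turns the integral into $(2/\sqrt{\pi})\int_0^\infty e^{-su^2}\,du$.

```python
import math

def laplace_of_inverse_sqrt(s, upper=12.0, n=120000):
    """Midpoint-rule estimate of the Laplace transform of 1/sqrt(pi*x) at real s > 0,
    computed as (2/sqrt(pi)) * integral of exp(-s*u*u) over [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += math.exp(-s * u * u)
    return 2.0 / math.sqrt(math.pi) * total * h

for s in (0.5, 2.0):
    print(s, laplace_of_inverse_sqrt(s), 1.0 / math.sqrt(s))
```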


Exercises 


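1. Find (a) $L^{-1}[5/(p+2)]$ and (b) $L^{-1}[1/p^s]$ where $s > 0$.

Solution: (a) Recall that $L[e^{ax}] = 1/(p - a)$; hence
$L^{-1}[1/(p - a)] = e^{ax}$. It follows that

$$L^{-1}\left[ \frac{5}{p+2} \right] = 5L^{-1}\left[ \frac{1}{p+2} \right] = 5e^{-2x}.$$

(b) Recall that

$$L[x^k] = \int_0^\infty e^{-px}x^k\,dx = \frac{\Gamma(k+1)}{p^{k+1}}.$$

From this we have

$$L^{-1}\left[ \frac{\Gamma(k+1)}{p^{k+1}} \right] = x^k
  \quad \text{or} \quad
  L^{-1}\left[ \frac{1}{p^{k+1}} \right] = \frac{x^k}{\Gamma(k+1)}.$$

If we now let $k + 1 = s$, then

$$L^{-1}\left[ \frac{1}{p^s} \right] = \frac{x^{s-1}}{\Gamma(s)}.$$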

2. Solve the differential equation

$$f''(x) + f(x) = 1 \tag{13.46}$$

with the initial conditions

$$f(0) = f'(0) = 0$$

using the Laplace transformation.

Solution: Taking the Laplace transform of both sides of (13.46),
we obtain

$$L[f''(x)] + L[f(x)] = L[1]. \tag{13.47}$$

Substituting the result

$$L[f''(x)] = s^2 L[f(x)] - s\,f(0) - f'(0) = s^2 L[f(x)]$$

and $L[1] = 1/s$ into (13.47) yields

$$s^2 L[f(x)] + L[f(x)] = \frac{1}{s},$$

i.e.,

$$L[f(x)] = F(s) = \frac{1}{s(s^2+1)} = \frac{1}{s} - \frac{s}{s^2+1}.$$

Thus we see that

$$f(x) = L^{-1}[F(s)]
  = L^{-1}\left[ \frac{1}{s} \right] - L^{-1}\left[ \frac{s}{s^2+1} \right]
  = \begin{cases} 1 - \cos x & \text{for } x > 0, \\ 0 & \text{for } x < 0, \end{cases}$$

which is the solution of the initial value problem originally given
by (13.46).
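The solution $f(x) = 1 - \cos x$ can be verified independently of the Laplace-transform machinery by integrating the initial value problem directly. The following is a minimal sketch; the function name `solve_ode`, the classical fourth-order Runge-Kutta scheme, and the step count are illustrative choices, not part of the exercise.

```python
import math

def solve_ode(x_end, n=2000):
    """Integrate f'' + f = 1, f(0) = f'(0) = 0 up to x_end with classical RK4,
    written as the first-order system f' = g, g' = 1 - f. Returns f(x_end)."""
    h = x_end / n
    f, g = 0.0, 0.0

    def rhs(f_val, g_val):
        return g_val, 1.0 - f_val

    for _ in range(n):
        k1f, k1g = rhs(f, g)
        k2f, k2g = rhs(f + 0.5 * h * k1f, g + 0.5 * h * k1g)
        k3f, k3g = rhs(f + 0.5 * h * k2f, g + 0.5 * h * k2g)
        k4f, k4g = rhs(f + h * k3f, g + h * k3g)
        f += h * (k1f + 2.0 * k2f + 2.0 * k3f + k4f) / 6.0
        g += h * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
    return f

for x in (1.0, 3.0):
    print(x, solve_ode(x), 1.0 - math.cos(x))
```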


3. Derive the two-sided Laplace transform of the following three functions:

$$f_a(x) = \begin{cases} e^{-2x} - e^{-x}, & x > 0, \\ 0, & x < 0, \end{cases}$$

$$f_b(x) = \begin{cases} e^{-2x}, & x > 0, \\ e^{-x}, & x < 0, \end{cases}$$

and

$$f_c(x) = \begin{cases} 0, & x > 0, \\ -e^{-2x} + e^{-x}, & x < 0. \end{cases}$$

Solution: The two-sided Laplace transforms read, respectively,

$$\mathcal{L}[f_a(x)] = \frac{1}{s+2} - \frac{1}{s+1} \quad \text{for } \sigma > -1,$$

$$\mathcal{L}[f_b(x)] = \frac{1}{s+2} - \frac{1}{s+1} \quad \text{for } -2 < \sigma < -1,$$

$$\mathcal{L}[f_c(x)] = \frac{1}{s+2} - \frac{1}{s+1} \quad \text{for } \sigma < -2.$$

Clearly, all three functions of $s$ are the same and may be labeled
$\mathcal{F}(s)$ (although the regions of convergence differ). This implies
that the inverse of a two-sided transform is uniquely determined only
after the location of $\sigma$ is fixed.

13.5 Applications in Physics and Engineering

13.5.1 Electric Circuits I 

The most familiar applications of Laplace transformations in the physical
sciences are encountered in analyses of electric circuits. Consider the RC
circuit depicted in Fig. 13.12. The electric charge $q(t)$ deposited in the
condenser with capacitance $C$ is governed by the equation

$$R\frac{dq(t)}{dt} + \frac{q(t)}{C} = v(t), \quad q(t = 0) = 0, \tag{13.48}$$

where $R$ is a resistance and $v(t)$ is the external voltage. We set a
rectangular voltage defined by (see Fig. 13.13)

$$v(t) = v_0 \times [\theta(t - a) - \theta(t - b)] \quad (a < b)$$

with the step function

$$\theta(t - a) = \begin{cases} 0, & t < a, \\ 1, & t > a. \end{cases} \tag{13.49}$$



Fig. 13.12. Diagram of an RC circuit

Fig. 13.13. The time dependence of a rectangular voltage applied to the RC circuit

We now want to solve the differential equation (13.48) with respect to $q(t)$.
To do this, we apply the Laplace transform to both sides of (13.48) and make
use of the symbol $Q(s) = L[q(t)]$. Straightforward calculation yields

$$sQ(s) + \frac{Q(s)}{\tau}
  = \frac{v_0}{R}\left( \frac{e^{-as}}{s} - \frac{e^{-bs}}{s} \right),$$

where $\tau = RC$ is called a damping time constant. Hence, we have

$$Q(s) = \frac{v_0}{R}\,\frac{1}{s + \tau^{-1}}
    \left( \frac{e^{-as}}{s} - \frac{e^{-bs}}{s} \right)
  = Cv_0\left( \frac{1}{s} - \frac{1}{s + \tau^{-1}} \right)
    \left( e^{-as} - e^{-bs} \right)$$

$$\phantom{Q(s)} = Cv_0\left[ \frac{e^{-as}}{s} - \frac{e^{-bs}}{s}
    - \frac{e^{-as}}{s + \tau^{-1}} + \frac{e^{-bs}}{s + \tau^{-1}} \right].$$

Then we use the inverse transform to obtain

$$q(t) = L^{-1}[Q(s)]
  = Cv_0\left[ \theta(t - a) - \theta(t - b)
    - e^{-(t-a)/\tau}\theta(t - a) + e^{-(t-b)/\tau}\theta(t - b) \right]$$

$$\phantom{q(t)} = \begin{cases}
  0, & t < a, \\
  Cv_0\left[ 1 - e^{-(t-a)/\tau} \right], & a < t < b, \\
  Cv_0\left[ e^{b/\tau} - e^{a/\tau} \right]e^{-t/\tau}, & t > b.
\end{cases}$$

The explicit time-dependence of the charge $q(t)$ given by (13.50) is
illustrated in Fig. 13.14, in which various separations $b - a$ are taken.
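The piecewise expression for $q(t)$ can be compared with a direct numerical integration of (13.48). This is only a sketch under illustrative assumptions: the function names, the parameter values $v_0 = R = C = 1$, $a = 1$, $b = 2$, and the simple explicit Euler scheme are not taken from the text.

```python
import math

def charge(t, v0=1.0, R=1.0, C=1.0, a=1.0, b=2.0):
    """Closed-form charge q(t) for the rectangular voltage pulse on [a, b]."""
    tau = R * C
    if t < a:
        return 0.0
    if t < b:
        return C * v0 * (1.0 - math.exp(-(t - a) / tau))
    return C * v0 * (math.exp(b / tau) - math.exp(a / tau)) * math.exp(-t / tau)

def charge_euler(t_end, v0=1.0, R=1.0, C=1.0, a=1.0, b=2.0, n=40000):
    """Explicit Euler integration of R*dq/dt + q/C = v(t) with q(0) = 0."""
    h = t_end / n
    q = 0.0
    for k in range(n):
        t = k * h
        v = v0 if a < t < b else 0.0
        q += h * (v - q / C) / R
    return q

for t in (0.5, 1.5, 3.0):
    print(t, charge(t), charge_euler(t))
```

The two columns agree to within the first-order accuracy of the Euler scheme; a higher-order integrator would tighten the match but is not needed for a consistency check.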



Fig. 13.14. Time dependence of the electric charge $q(t)$ described by (13.50), which
is accumulated in the condenser in the RC circuit. The parameter $a$ introduced in
(13.49) is fixed at $a = 1.0$

13.5.2 Electric Circuits II

Next, in order to illustrate the use of convolution integrals in applications
of Laplace transforms, we solve the previous equation (13.48) with respect to
the current $i(t)$ instead of charge $q(t)$. We consider the differential
equation

$$Ri(t) + \frac{1}{C}\int_0^t i(u)\,du = v(t), \tag{13.50}$$

with the rectangular voltage (13.49). The integral term on the left-hand side
in (13.50) is rewritten as a convolution integral:

$$\int_0^t i(u)\,du = \int_0^t i(u)\theta(t - u)\,du = \theta(t) * i(t), \tag{13.51}$$

whose Laplace transform reads

$$L[\theta(t) * i(t)] = L[\theta(t)] \cdot L[i(t)] = \frac{1}{s}I(s).$$

Hence, applying the Laplace transformation to both sides of (13.50) yields

$$RI(s) + \frac{I(s)}{Cs} = \frac{v_0}{s}\left( e^{-as} - e^{-bs} \right), \tag{13.52}$$

which implies

$$I(s) = \frac{v_0}{R}\,\frac{e^{-as} - e^{-bs}}{s + \tau^{-1}}
  \quad (\tau = RC). \tag{13.53}$$


Fig. 13.15. Time dependence of the current $i(t)$ in the RC circuit described by
(13.54). The parameter $a$ introduced in (13.49) is fixed at $a = 1.0$
(horizontal axis: time $t$)

Using the inverse transformation, we finally get

$$i(t) = L^{-1}[I(s)]
  = \frac{v_0}{R}\left[ e^{-(t-a)/\tau}\theta(t - a)
    - e^{-(t-b)/\tau}\theta(t - b) \right]
  = \begin{cases}
    0, & t < a, \\[2pt]
    \dfrac{v_0}{R}\,e^{a/\tau}e^{-t/\tau}, & a < t < b, \\[6pt]
    \dfrac{v_0}{R}\left( e^{a/\tau} - e^{b/\tau} \right)e^{-t/\tau}, & t > b.
  \end{cases} \tag{13.54}$$

Figure 13.15 illustrates the time dependence of the current i(t). 
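As a consistency check, the piecewise current can be substituted back into the integral equation $Ri(t) + (1/C)\int_0^t i(u)\,du = v(t)$. The sketch below uses illustrative names and parameter values ($v_0 = R = C = 1$, $a = 1$, $b = 2$) and a midpoint rule for the integral; none of these choices come from the text.

```python
import math

def current(t, v0=1.0, R=1.0, C=1.0, a=1.0, b=2.0):
    """Closed-form current i(t) for the rectangular voltage pulse on [a, b]."""
    tau = R * C
    if t < a:
        return 0.0
    if t < b:
        return v0 / R * math.exp(a / tau) * math.exp(-t / tau)
    return v0 / R * (math.exp(a / tau) - math.exp(b / tau)) * math.exp(-t / tau)

def voltage(t, v0=1.0, a=1.0, b=2.0):
    """Rectangular voltage: v0 for a < t < b and zero otherwise."""
    return v0 if a < t < b else 0.0

def circuit_residual(t, n=20000, v0=1.0, R=1.0, C=1.0, a=1.0, b=2.0):
    """R*i(t) + (1/C)*integral_0^t i(u) du - v(t), with a midpoint-rule integral."""
    h = t / n
    integral = sum(current((k + 0.5) * h, v0, R, C, a, b) for k in range(n)) * h
    return R * current(t, v0, R, C, a, b) + integral / C - voltage(t, v0, a, b)

# The residual should be close to zero before, during, and after the pulse.
for t in (0.5, 1.5, 3.0):
    print(t, current(t), circuit_residual(t))
```

Note that the current changes sign after $t = b$: once the voltage is switched off, the condenser discharges through the resistor.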
Wavelet Transformation


Abstract Similar to the Fourier and Laplace transforms, a wavelet transform is an 
integral transform of a function by using “wavelets.” A wavelet is a mathematical 
mold with a finite-length and fast-decaying oscillating waveform, which is used to 
divide a given function into different scale components. Wavelet transforms have 
certain advantages over conventional Fourier transforms, as they can reveal the 
nature of a function in the time and frequency domains simultaneously. 


14.1 Continuous Wavelet Analyses 

14.1.1 Definition of Wavelet 


This short chapter covers the minimum ground for understanding wavelet 
analysis. The concept of wavelet originates from the study of signal analysis, 
i.e., from the need in certain cases to analyze a signal in the time and frequency 
domains simultaneously. The crucial advantage of wavelet analyses is that 
they allow us to decompose complicated information contained in a signal 
into elementary functions associated with different time scales and different 
frequencies and to reconstruct it with high precision and efficiency. In the 
following discussions, we first determine what constitutes a wavelet and then 
describe how it is used in the transformation of a signal. 

The primary question concerns the definition of a wavelet: 


Wavelet:

A wavelet is a real-valued function $\psi(t)$ having a localized waveform
that satisfies the following criteria:

1. $\displaystyle \int_{-\infty}^{\infty} \psi(t)\,dt = 0.$

2. $\displaystyle \int_{-\infty}^{\infty} \psi(t)^2\,dt = 1.$

3. The Fourier transform $\Psi(\omega)$ of $\psi(t)$ satisfies the
admissibility condition expressed by

$$C_\Psi \equiv \int_0^\infty \frac{|\Psi(\omega)|^2}{\omega}\,d\omega < \infty. \tag{14.1}$$

Here, $C_\Psi$ is called the admissibility constant, whose value depends
on the chosen wavelet.


We restrict our attention to real-valued wavelets, although it is possible to
define complex-valued wavelets as well. Observe that condition 2 above says
that $\psi(t)$ has to deviate from zero at finite intervals of $t$. On the
other hand, condition 1 tells us that any deviation above zero must be canceled
out by a deviation below zero. Hence, $\psi(t)$ must oscillate across the
$t$-axis like a wave. The following are the two most important examples of
wavelets:


Examples 1. The Haar wavelet (See Fig. 14.1a): 

-1 < t < 0, 


i 

\/2 ’ 


4 V2’ 

0, 


0 < t < 1, 
otherwise. 


2. The Mexican hat wavelet (see Fig. 14.1b): 


V’O) = 


(*-£) 


_ t±) e -t 2 /(2cr 2 ) 


\[Za'K Y l A 


(14.2) 


(14.3) 


To form the Mexican hat wavelet (14.3), we start with the Gaussian function
with mean zero and variance $\sigma^2$:

$$f(t) = \frac{e^{-t^2/(2\sigma^2)}}{\sqrt{2\pi\sigma^2}}.$$

Fig. 14.1. (a) The Haar wavelet given by (14.2). (b) The Mexican hat wavelet
given by (14.3) with $\sigma = 1$

If we take the negative of the second derivative of $f(t)$ and normalize it
so as to satisfy condition 2, we obtain the Mexican hat wavelet (14.3). In what
follows, we proceed with our argument on the basis of that wavelet by
setting $\sigma = 1$ and omitting the normalization constant for simplicity.


Remark. We know that all the derivatives of the Gaussian function may be
used as wavelets. The most appropriate one in any particular case depends on
the application.
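
As a quick numerical sanity check of conditions 1 and 2, the following Python
sketch integrates the two example wavelets with a simple midpoint rule. The
function names (`haar`, `mexican_hat`, `integrate`), the integration windows,
and the endpoint convention used for the Haar wavelet are assumptions made for
illustration; only the standard library is used.

```python
import math


def haar(t):
    """Haar wavelet of (14.2); the endpoint convention is an assumption."""
    if -1.0 < t <= 0.0:
        return 1.0 / math.sqrt(2.0)
    if 0.0 < t <= 1.0:
        return -1.0 / math.sqrt(2.0)
    return 0.0


def mexican_hat(t, sigma=1.0):
    """Normalized Mexican hat wavelet of (14.3)."""
    norm = 2.0 / (math.sqrt(3.0 * sigma) * math.pi ** 0.25)
    u = t / sigma
    return norm * (1.0 - u * u) * math.exp(-0.5 * u * u)


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Condition 1 (zero mean) and condition 2 (unit square integral).
haar_mean = integrate(haar, -2.0, 2.0)
haar_energy = integrate(lambda t: haar(t) ** 2, -2.0, 2.0)
hat_mean = integrate(mexican_hat, -12.0, 12.0)
hat_energy = integrate(lambda t: mexican_hat(t) ** 2, -12.0, 12.0)

print(haar_mean, haar_energy, hat_mean, hat_energy)
```

Under these assumptions both means should come out close to zero and both
square integrals close to one. The admissibility condition 3 is not tested
here.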


14.1.2 The Wavelet Transform 

In mathematical terminology, the wavelet transform is known as a convolution;
more precisely, it is a convolution of the wavelet function with a
signal to be analyzed. In the convolution process, two parameters are involved
that manipulate the functional form of the wavelet. The first is the dilatation
parameter denoted by $a$, which characterizes the dilation and contraction of
the wavelet in the time domain (see Fig. 14.2a). For the Mexican hat wavelet,
it is the distance between the center of the wavelet and its crossing of the time
axis. The second is the translation parameter $b$, which governs the movement
of the wavelet along the time axis (see Fig. 14.2b). With this notation,
shifted and dilated versions of a Mexican hat wavelet are expressed by

$$\psi\!\left(\frac{t-b}{a}\right) = \left[1 - \left(\frac{t-b}{a}\right)^2\right] e^{-[(t-b)/a]^2/2}, \tag{14.4}$$

where we have set $\sigma = 1$ in (14.3) and omitted the normalization factor for
simplicity. We are now in a position to define the wavelet transform.

Fig. 14.2. Translation (a) and dilatation of a wavelet (b)

Wavelet transform:

The wavelet transform $T(a, b)$ of a continuous signal $x(t)$ with respect
to the wavelet $\psi(t)$ is defined by

$$T(a,b) = w(a) \int_{-\infty}^{\infty} x(t)\, \psi\!\left(\frac{t-b}{a}\right) dt, \tag{14.5}$$

where $w(a)$ is an appropriate weight function.


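Typically, $w(a)$ is set to $1/\sqrt{a}$ because this choice yields

$$\int_{-\infty}^{\infty} \left[\frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right)\right]^2 dt = \int_{-\infty}^{\infty} \psi(u)^2\, du = 1 \quad \text{with } u = \frac{t-b}{a},$$

i.e., the normalization condition for the square integral of $\psi(t)$ remains
invariant, which is why we use this value for the rest of this section.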

The dilated and shifted wavelet is often written more compactly as

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right),$$

so that the transform integral may be written as

$$T(a,b) = \int_{-\infty}^{\infty} x(t)\, \psi_{a,b}(t)\, dt. \tag{14.6}$$

From here on, we use this notation and refer to $\psi_{a,b}(t)$ simply as the wavelet.
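
The invariance of the square integral under the choice $w(a) = 1/\sqrt{a}$ can
be checked numerically. The sketch below is only illustrative: it assumes the
normalized Mexican hat wavelet (14.3) with $\sigma = 1$, a midpoint rule, and a
truncated integration window, and the helper names are hypothetical.

```python
import math


def mexican_hat(u):
    """Normalized Mexican hat wavelet (14.3) with sigma = 1."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - u * u) * math.exp(-0.5 * u * u)


def psi_ab(t, a, b):
    """Dilated and shifted wavelet with the weight w(a) = 1/sqrt(a)."""
    return mexican_hat((t - b) / a) / math.sqrt(a)


def energy(a, b, half_width=12.0, n=20000):
    """Midpoint-rule value of the square integral of psi_ab.

    The integral is truncated to |t - b| <= half_width * a, outside of
    which the Gaussian factor makes the wavelet negligible.
    """
    lo = b - half_width * a
    h = 2.0 * half_width * a / n
    return h * sum(psi_ab(lo + (i + 0.5) * h, a, b) ** 2 for i in range(n))


# A few arbitrarily chosen scales and locations.
energies = {
    (a, b): energy(a, b) for a in (0.33, 1.0, 4.0) for b in (-1.0, 0.0, 2.5)
}
for key, value in energies.items():
    print(key, round(value, 6))
```

Every entry should stay close to one regardless of $a$ and $b$, which is the
invariance that motivates the weight $1/\sqrt{a}$.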


14.1.3 Correlation Between Wavelet and Signal 

Having defined the wavelet and its transform, we are ready to see how the 
transform is used as a signal analysis tool. In plain words, the wavelet trans- 
form works as a mathematical microscope, where b is the location on the time 
series being viewed and a represents the magnification at location b. 

Let us look at a simple example evaluating the wavelet transform $T(a, b)$.
Figures 14.3 and 14.4 show the same sinusoidal waves together with Mexican
hat wavelets of various locations and dilations. In Fig. 14.3a, the wavelet is
located on a segment of the signal on which a positive part of the signal
largely coincides with that of the wavelet. This results in a large positive
value of $T(a, b)$ in (14.6). In Fig. 14.3b, the wavelet is moved to a new location
where the wavelet and the signal are out of phase. In this case, the convolution
expressed by (14.6) produces a large negative value of $T(a, b)$. In between these
two extrema, the value of $T(a, b)$ decreases from a maximum to a minimum as
shown in Fig. 14.3c. The three figures thus clearly demonstrate how the wavelet
transform $T(a, b)$ depends on the translation parameter $b$ of the wavelet of
interest.

Fig. 14.3. (a), (b) Positional relations between the wavelet (thick) and signal
(thin). The wavelet in (a) located at $b_1 = \pi/2$ is in phase with the signal, which
results in a large positive value of $T(a, b)$ at $b_1$. The wavelet in (b) located at
$b_2 = -\pi/2$ is out of phase with the signal, which yields a large negative value of
$T(a, b)$ at $b_2$. (c) The plot of $T(a = 1.0, b)$ as a function of $b$

Fig. 14.4. Wavelets with $a = 0.33$ (a) and $a = 4.0$ (b), in which $b = \pi/2$ is fixed.
The resulting wavelet transform $T(a, b = \pi/2)$ as a function of $a$ is given in (c)


In a similar way, Fig. 14.4a-c shows the dependence of $T(a, b)$ on the
dilatation parameter $a$. When $a$ is quite small, the positive and negative parts
of the wavelet are all convolved with roughly the same part of the signal $x(t)$,
producing a value of $T(a, b)$ near zero (see Fig. 14.4a). Likewise, $T(a, b)$ tends
to zero as $a$ becomes very large (see Fig. 14.4b), since the wavelet covers many
positive and negative repeating parts of the signal. These latter two results
indicate that when the dilatation parameter $a$ is either very small or very
large compared with the period of the signal, the wavelet transform $T(a, b)$
gives near-zero values.

Figure 14.5 shows a contour plot of $T(a,b)$ vs. $a$ and $b$ for a sinusoidal
signal

$$x(t) = \sin t,$$

where the Mexican hat wavelet has been used. The light and shadowed regions
indicate positive and negative magnitudes of $T(a, b)$, respectively. The
near-zero values of $T(a, b)$ are evident in the plot at both large and small values
of $a$. In addition, at intermediate values of $a$, we observe large undulations
in $T(a,b)$ corresponding to the sinusoidal form of the signal. This wavelike
behavior is accounted for by referring back to Figs. 14.3a-b and 14.4a-b,
where wavelets move in and out of phase with the signal.

Therefore, when the wavelet matches the shape of the signal well at a
specific scale and location, the transform value is high. On the other hand, if
the wavelet and the signal do not correlate well, the transform value is low.
Carrying out the process at various signal locations and for various wavelet
scales, we can determine the correlation between the wavelet and the signal.

Remark. In Fig. 14.5, the maxima and minima of the transform occur at an
$a$ scale of about $\pi/2$, i.e., one quarter of the period $2\pi$ of the sine wave
$x(t) = \sin t$. This feature holds in general; the correlation between the
wavelet and a signal $x(t)$ with period $p$ becomes a maximum close to $a = p/4$.
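
The dependence of $T(a, b)$ on $b$ and $a$ described above can be reproduced
with a few lines of Python. This is a minimal sketch, assuming the unnormalized
Mexican hat wavelet of (14.4), the weight $w(a) = 1/\sqrt{a}$, and a midpoint-rule
approximation of (14.6) on a truncated window; the function names and the scale
grid are illustrative choices.

```python
import math


def psi(u):
    """Mexican hat with sigma = 1, normalization omitted as in (14.4)."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt(x, a, b, half_width=10.0, n=4000):
    """T(a, b) of (14.6) with w(a) = 1/sqrt(a), by the midpoint rule.

    The integral is truncated to |t - b| <= half_width * a, where the
    Gaussian factor of the wavelet is negligible.
    """
    lo = b - half_width * a
    h = 2.0 * half_width * a / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h
        total += x(t) * psi((t - b) / a)
    return total * h / math.sqrt(a)


# Dependence on b at fixed a = 1 (compare Fig. 14.3a and 14.3b).
t_in_phase = cwt(math.sin, 1.0, math.pi / 2)
t_out_of_phase = cwt(math.sin, 1.0, -math.pi / 2)

# Dependence on a at fixed b = pi/2 (compare Fig. 14.4c).
scales = [0.05 * k for k in range(1, 121)]
profile = [cwt(math.sin, a, math.pi / 2) for a in scales]
a_peak = scales[profile.index(max(profile))]

print(t_in_phase, t_out_of_phase, a_peak, math.pi / 2)
```

With these settings the transform is positive at $b = \pi/2$, negative at
$b = -\pi/2$, and the scan over $a$ should peak within about one grid step of
$\pi/2 \approx 1.57$, in line with the Remark.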


Fig. 14.5. Contour plot of the wavelet transform $T(a, b)$ of a sinusoidal wave
$x(t) = \sin t$



14.1.4 Actual Application of the Wavelet Transform 

The wavelet transformation procedure can be applied to signals that have a
more complicated wave form than a simple sinusoidal wave. Figure 14.6 shows
a signal

$$x(t) = \sin t + \sin 3t$$

composed of two sinusoidal waves with different frequencies. The wavelet
transform $T(a, b)$ of $x(t)$ is plotted in Fig. 14.6. It is clear that the
contribution from the wave with the higher-frequency oscillation appears at a
smaller $a$ scale. This clearly demonstrates the ability of the wavelet transform
to decompose the original signal into its separate components.

Fig. 14.6. Wavelet transform $T(a, b)$ of a complicated signal $x(t) = \sin t + \sin 3t$
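
One way to see the separation of the two components numerically is to count how
often $T(a, b)$ changes sign as $b$ runs over one period $2\pi$ of the signal, at
a small and at a large scale. The sketch below assumes the unnormalized Mexican
hat wavelet with $\sigma = 1$, the weight $w(a) = 1/\sqrt{a}$, and a midpoint rule
on a truncated window; the two scales $\pi/6$ and $\pi/2$ and all names are
chosen here for illustration.

```python
import math


def psi(u):
    """Mexican hat with sigma = 1, normalization constant omitted."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt(x, a, b, half_width=10.0, n=2000):
    """Midpoint-rule approximation of T(a, b) with w(a) = 1/sqrt(a)."""
    lo = b - half_width * a
    h = 2.0 * half_width * a / n
    total = 0.0
    for i in range(n):
        t = lo + (i + 0.5) * h
        total += x(t) * psi((t - b) / a)
    return total * h / math.sqrt(a)


def signal(t):
    """Two-component signal x(t) = sin t + sin 3t."""
    return math.sin(t) + math.sin(3.0 * t)


def sign_changes(values):
    """Number of sign changes around a cyclic list of nonzero values."""
    return sum(
        1 for i in range(len(values)) if values[i] * values[i - 1] < 0.0
    )


N_B = 120
b_grid = [(k + 0.5) * 2.0 * math.pi / N_B for k in range(N_B)]
fine_scale = [cwt(signal, math.pi / 6, b) for b in b_grid]
coarse_scale = [cwt(signal, math.pi / 2, b) for b in b_grid]
n_fine = sign_changes(fine_scale)
n_coarse = sign_changes(coarse_scale)

print(n_fine, n_coarse)
```

At the scale $\pi/6$ the transform should change sign six times per period,
following the $\sin 3t$ component, whereas at $\pi/2$ only the two sign changes
of $\sin t$ remain.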

14.1.5 Inverse Wavelet Transform 

Similar to its Fourier counterpart, there is an inverse wavelet transformation
that enables us to reproduce the original signal $x(t)$ from its wavelet
transform $T(a, b)$.

Inverse wavelet transform:

If $x \in L^2(\mathbb{R})$, then $x$ can be reconstructed by the equation

$$x(t) = \frac{1}{C_\Psi} \int_{-\infty}^{\infty} db \int_0^{\infty} \frac{da}{a^2}\, T(a,b)\, \psi_{a,b}(t), \tag{14.7}$$

where the equality holds almost everywhere.

The proof of the equation is based on the lemma below.

Parseval identity for wavelet transform:

Let $T_f(a, b)$ and $T_g(a, b)$ be the wavelet transforms of $f(t), g(t) \in L^2(\mathbb{R})$,
respectively, associated with the wavelet $\psi_{a,b}(t)$. Then we have

$$\int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} db\, T_f(a,b)\, T_g^*(a,b) = C_\Psi \int_{-\infty}^{\infty} f(t)\, g(t)^*\, dt. \tag{14.8}$$

This identity is derived in Exercise 4. We are now ready to prove the inverse
transformation (14.7).

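Proof (of the inverse wavelet transformation): Assume an arbitrary
real function $g(t) \in L^2(\mathbb{R})$. It follows from the Parseval identity that

$$\begin{aligned}
C_\Psi \int_{-\infty}^{\infty} f(t)\, g(t)\, dt
&= \int_{-\infty}^{\infty} db \int_0^{\infty} \frac{da}{a^2}\, T_f(a,b)\, T_g(a,b) \\
&= \int_{-\infty}^{\infty} db \int_0^{\infty} \frac{da}{a^2}\, T_f(a,b) \int_{-\infty}^{\infty} g(t)\, \psi_{a,b}(t)\, dt \\
&= \int_{-\infty}^{\infty} dt\, g(t) \left[ \int_{-\infty}^{\infty} db \int_0^{\infty} \frac{da}{a^2}\, T_f(a,b)\, \psi_{a,b}(t) \right].
\end{aligned}$$

Since $g(t)$ is arbitrary, the inverse equation (14.7) follows.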


14.1.6 Noise Reduction Technique 

Suppose that the inverse transformation equation (14.7) is rewritten as

$$x^*(t) = \frac{1}{C_\Psi} \int_{-\infty}^{\infty} db \int_{a^*}^{\infty} \frac{da}{a^2}\, T(a,b)\, \psi_{a,b}(t),$$

i.e., the integration range with respect to $a$ is restricted to an interval
$[a^*, \infty)$ with $a^* > 0$. Then, the result $x^*(t)$ obtained on the left-hand
side deviates from the original signal $x(t)$ owing to the lack of information
for the scales from $a = 0$ to $a = a^*$. In applications, this deviation property
is exploited as a noise reduction technique.

By way of a demonstration, Fig. 14.7a illustrates a segment of the signal

$$x(t) = \sin t + \sin 3t + R(t)$$

constructed from two sinusoidal waveforms plus a local burst of noise $R(t)$. The
transform plot of the composite signal shows the two constituent waveforms
at scales $a_1 = \pi/2$ and $a_2 = \pi/6$ in addition to a burst of noise around $b = 5.0$
in a high-frequency region (i.e., small $a$ scale).

Now we try to remove the high-frequency noise component by means of
the following reconstruction procedure. Figure 14.7c shows a reconstruction
of the signal where we artificially set $T(a, b) = 0$ for $a < a^*$. In effect, we are
reconstructing the signal using

$$x^*(t) = \frac{1}{C_\Psi} \int_{-\infty}^{\infty} db \int_{a^*}^{\infty} \frac{da}{a^2}\, T(a,b)\, \psi_{a,b}(t),$$

i.e., over a range of scales $[a^*, \infty)$. The lower integral limit, $a^*$, is the cut-off
scale indicated by the dotted line in Fig. 14.7b. As a result, the high-frequency
noise component is evidently reduced in the reconstructed signal, as shown in
Fig. 14.7c. This simple noise reduction method is known as scale-dependent
thresholding.

Fig. 14.7. Noise reduction procedure through wavelet transformation. (a) A signal
$x(t) = \sin t + \sin 3t + R(t)$ with a local burst of noise $R(t)$. (b) The wavelet
transform $T(a,b)$ of $x(t)$. Noise reduction is accomplished through the inverse
transformation of $T(a, b)$ by applying an artificial condition $T(a < a^*, b) = 0$.
(c) The reconstructed signal $x^*(t)$ from the noise-reduction procedure


Exercises 

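1. Show that the Fourier transform of the Haar wavelet satisfies the admissibility
condition (14.1).

Solution: The Fourier transform of the Haar wavelet $\psi(t)$, taken here as
$+1$ on $[0, 1/2)$ and $-1$ on $[1/2, 1)$, is given by

$$\Psi(\omega) = \int_0^{1/2} e^{-i\omega t}\, dt - \int_{1/2}^{1} e^{-i\omega t}\, dt = i e^{-i\omega/2}\, \frac{\sin^2(\omega/4)}{\omega/4}.$$

Hence, we have

$$\int_0^{\infty} \frac{|\Psi(\omega)|^2}{\omega}\, d\omega = 16 \int_0^{\infty} \frac{\sin^4(\omega/4)}{\omega^3}\, d\omega < \infty.$$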


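2. Prove that the Fourier transform of $\psi_{a,b}(t)$ yields $\Psi_{a,b}(\omega) = \sqrt{a}\, e^{-ib\omega}\, \Psi(a\omega)$.

Solution: It readily follows that

$$\Psi_{a,b}(\omega) = \int_{-\infty}^{\infty} e^{-i\omega t}\, \psi_{a,b}(t)\, dt = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} e^{-i\omega t}\, \psi\!\left(\frac{t-b}{a}\right) dt.$$

Set $u = (t - b)/a$ in the last integral to obtain

$$\Psi_{a,b}(\omega) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} e^{-i\omega(au+b)}\, \psi(u)\, a\, du = \sqrt{a}\, e^{-ib\omega}\, \Psi(a\omega).$$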

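3. Let $\psi(t)$ be a wavelet and $\phi(t)$ be a real, bounded, and integrable function.
Show that the convolution $\psi * \phi$ is also a wavelet.

Solution: We first show that $\psi * \phi \in L^2(\mathbb{R})$. Observe that

$$\begin{aligned}
[\psi(t) * \phi(t)]^2 &= \left[ \int_{-\infty}^{\infty} \psi(t-u)\, \phi(u)\, du \right]^2 \\
&= \left[ \int_{-\infty}^{\infty} \psi(t-u)\, \phi(u)^{1/2}\, \phi(u)^{1/2}\, du \right]^2 \\
&\le \int_{-\infty}^{\infty} \psi(t-u)^2\, \phi(u)\, du \int_{-\infty}^{\infty} \phi(u')\, du'.
\end{aligned}$$

The integral $\int_{-\infty}^{\infty} \phi(u')\, du'$ is a constant, denoted by $A$. Integrate
both sides with respect to $t$ to obtain

$$\begin{aligned}
\int_{-\infty}^{\infty} [\psi(t) * \phi(t)]^2\, dt &\le A \int_{-\infty}^{\infty} \phi(u) \left[ \int_{-\infty}^{\infty} \psi(t-u)^2\, dt \right] du \\
&= A \int_{-\infty}^{\infty} \phi(u)\, du \int_{-\infty}^{\infty} \psi(t)^2\, dt = A^2 \int_{-\infty}^{\infty} \psi(t)^2\, dt < \infty,
\end{aligned}$$

which clearly indicates that $\psi * \phi \in L^2(\mathbb{R})$. Next we show that the
convolution $\psi * \phi$ satisfies the admissibility condition. In fact,

$$\int_0^{\infty} \frac{|\mathcal{F}[\psi * \phi]|^2}{\omega}\, d\omega = \int_0^{\infty} \frac{|\Psi(\omega)\, \Phi(\omega)|^2}{\omega}\, d\omega \le \int_0^{\infty} \frac{|\Psi(\omega)|^2}{\omega}\, \sup |\Phi(\omega)|^2\, d\omega < \infty.$$

These two results imply that the convolution $\psi * \phi$ is a wavelet.

4. Derive the Parseval identity for the wavelet transform (14.8).

Solution: The transform $T_f(a, b)$ reads

$$T_f(a,b) = \int_{-\infty}^{\infty} f(t)\, \psi_{a,b}(t)\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, \sqrt{a}\, e^{-ib\omega}\, \Psi(a\omega)\, d\omega,$$

where we used the fact that $\Psi_{a,b}(\omega) = \sqrt{a}\, e^{-ib\omega}\, \Psi(a\omega)$. Similarly,
we have

$$T_g(a,b) = \frac{1}{2\pi} \int_{-\infty}^{\infty} G(\omega)\, \sqrt{a}\, e^{-ib\omega}\, \Psi(a\omega)\, d\omega.$$

Hence, we have

$$\begin{aligned}
\int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} db\, T_f(a,b)\, T_g(a,b)
&= \int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} db \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} d\omega'\, \frac{a\, e^{-ib(\omega+\omega')}}{(2\pi)^2}\, F(\omega)\, G(\omega')\, \Psi(a\omega)\, \Psi(a\omega') \\
&= \frac{1}{2\pi} \int_0^{\infty} \frac{da}{a} \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} d\omega'\, F(\omega)\, G(\omega')\, \Psi(a\omega)\, \Psi(a\omega')\, \delta(\omega + \omega') \\
&= \frac{1}{2\pi} \int_0^{\infty} \frac{da}{a} \int_{-\infty}^{\infty} d\omega\, F(\omega)\, G(-\omega)\, \Psi(a\omega)\, \Psi(-a\omega).
\end{aligned}$$

Since $\psi(t)$ and $g(t)$ are both real, $\Psi(-a\omega) = \Psi(a\omega)^*$ and $G(-\omega) = G^*(\omega)$.
Thus we have

$$\int_0^{\infty} \frac{da}{a^2} \int_{-\infty}^{\infty} db\, T_f(a,b)\, T_g(a,b) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega\, F(\omega)\, G^*(\omega) \int_0^{\infty} \frac{dx}{x}\, |\Psi(x)|^2 = C_\Psi \int_{-\infty}^{\infty} f(t)\, g(t)\, dt,$$

where $x = a\omega$. This completes the proof.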


14.2 Discrete Wavelet Analysis 

14.2.1 Discrete Wavelet Transforms 

Having discussed the continuous wavelet transform, we move on to its discrete 
version, known as the discrete wavelet transform. In many applications, 
data are represented by a finite number of values, so it is important and often 
useful to consider the discrete version of a wavelet transform. We also can use 
an efficient numerical algorithm, called the fast wavelet transform, which 
allows us to compute the wavelet transform of the signal and its inverse quite 
efficiently. 

We begin with the definition of a discrete wavelet. In the previous section,
the wavelet function was defined at scale $a$ and location $b$ as

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right),$$

in which the values of parameters $a$ and $b$ can change continuously. We now
want to discretize the values of $a$ and $b$. One possible way to sample $a$ and $b$
is to use a logarithmic discretization of the $a$ scale and link this to the size of
the steps taken between $b$ locations. This kind of discretization yields

$$\psi_{m,n}(t) = \frac{1}{\sqrt{a_0^m}}\, \psi\!\left(\frac{t - n b_0 a_0^m}{a_0^m}\right), \tag{14.9}$$

where the integers $m$ and $n$ control the wavelet dilation and translation,
respectively; $a_0$ is a specified fixed dilation step parameter and $b_0$ is the
location parameter. In the expression (14.9), the size of the translation steps,
$\Delta b = b_0 a_0^m$, is directly proportional to the wavelet scale, $a_0^m$.

Common choices for the discrete wavelet parameters $a_0$ and $b_0$ are 1/2 and
1, respectively. This power-of-two logarithmic scaling of the dilation steps is
known as the dyadic grid arrangement. Substituting $a_0 = 1/2$ and $b_0 = 1$
into (14.9), we obtain the dyadic grid wavelet represented by

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n). \tag{14.10}$$


Using the dyadic grid wavelet of (14.10), we arrive at the discrete wavelet
transform of a continuous signal $x(t)$:

Discrete wavelet transform:

$$T_{m,n} = \int_{-\infty}^{\infty} x(t)\, \psi_{m,n}(t)\, dt = \int_{-\infty}^{\infty} x(t)\, 2^{m/2}\, \psi(2^m t - n)\, dt. \tag{14.11}$$


Remark. Note that the discrete wavelet transform (14.11) differs from the
discretized approximation of the continuous wavelet transform given by

$$T(a,b) = \int_{-\infty}^{\infty} x(t)\, \psi^*_{a,b}(t)\, dt \simeq \sum_{l=-\infty}^{\infty} x(l\Delta t)\, \psi^*_{a,b}(l\Delta t)\, \Delta t. \tag{14.12}$$

In (14.12), the integration variable $t$ is discretized, while $a$ and $b$ remain
continuous parameters whose values can be chosen arbitrarily. On the other hand,
in the discrete wavelet transform (14.11), $a$ and $b$ are discretized and $t$
remains continuous.

14.2.2 Complete Orthonormal Wavelets

The fundamental question is whether the original signal $x(t)$ can be reconstructed
from the discrete wavelet transform $T_{m,n}$ through the relation

$$x(t) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} T_{m,n}\, \psi_{m,n}(t). \tag{14.13}$$

As is intuitively understood, the reconstruction equation (14.13) is justified if
the discretized wavelets $\psi_{m,n}(t)$ are orthonormal and complete. The
completeness of $\psi_{m,n}(t)$ implies that any function $x \in L^2(\mathbb{R})$ can be expanded as

$$x(t) = \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} c_{m,n}\, \psi_{m,n}(t) \tag{14.14}$$

with appropriate expansion coefficients $c_{m,n}$. Hence, the orthonormality

$$\int_{-\infty}^{\infty} \psi_{m,n}(t)\, \psi_{m',n'}(t)\, dt = \delta_{m,m'}\, \delta_{n,n'} \tag{14.15}$$

results in $c_{m,n} = T_{m,n}$ in (14.14) because

$$\begin{aligned}
T_{m,n} = \int_{-\infty}^{\infty} x(t)\, \psi_{m,n}(t)\, dt
&= \int_{-\infty}^{\infty} \left[ \sum_{m'=-\infty}^{\infty} \sum_{n'=-\infty}^{\infty} c_{m',n'}\, \psi_{m',n'}(t) \right] \psi_{m,n}(t)\, dt \\
&= \sum_{m'=-\infty}^{\infty} \sum_{n'=-\infty}^{\infty} c_{m',n'} \int_{-\infty}^{\infty} \psi_{m,n}(t)\, \psi_{m',n'}(t)\, dt \\
&= \sum_{m'=-\infty}^{\infty} \sum_{n'=-\infty}^{\infty} c_{m',n'}\, \delta_{m,m'}\, \delta_{n,n'} = c_{m,n}.
\end{aligned}$$

In general, however, the wavelets $\psi_{m,n}(t)$ given by (14.9) are neither orthonormal
nor complete. We thus arrive at the following theorem:

Validity of the inverse transformation formula:

The inverse transformation formula (14.13) is valid only for a limited
class of sets of discrete wavelets $\{\psi_{m,n}(t)\}$ that is endowed with both
orthonormality and completeness.


The simplest example of such desired wavelets is the Haar discrete wavelet
presented below.

Examples The Haar discrete wavelet is defined by

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n),$$

where

$$\psi(t) = \begin{cases} 1, & 0 \le t < 1/2, \\ -1, & 1/2 \le t < 1, \\ 0, & \text{otherwise}. \end{cases}$$

This wavelet is known to be orthonormal and complete; its orthonormality is
verified in Exercise 1.
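
The orthonormality relation (14.15) can be spot-checked numerically for the
Haar discrete wavelet. The following sketch is an illustration under stated
assumptions rather than a proof: it evaluates the inner products for a small,
arbitrarily chosen set of indices $m, n \in \{-1, 0, 1\}$ with a midpoint rule
on a grid fine enough to resolve every jump. All names are hypothetical.

```python
import math


def haar(t):
    """Haar mother wavelet on [0, 1)."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def psi_mn(m, n, t):
    """Dyadic grid wavelet (14.10) built from the Haar mother wavelet."""
    return 2.0 ** (m / 2.0) * haar(2.0 ** m * t - n)


def inner(m, n, p, q, lo=-4.0, hi=4.0, cells_per_unit=16):
    """Integral of psi_mn * psi_pq by the midpoint rule.

    For |m|, |p| <= 1 both factors are constant on every cell of width
    1/16, so the midpoint sum reproduces the integral up to rounding.
    """
    h = 1.0 / cells_per_unit
    count = int((hi - lo) * cells_per_unit)
    return h * sum(
        psi_mn(m, n, lo + (i + 0.5) * h) * psi_mn(p, q, lo + (i + 0.5) * h)
        for i in range(count)
    )


indices = [(m, n) for m in (-1, 0, 1) for n in (-1, 0, 1)]
max_error = max(
    abs(inner(m, n, p, q) - (1.0 if (m, n) == (p, q) else 0.0))
    for (m, n) in indices
    for (p, q) in indices
)
print(max_error)
```

The printed value is the largest deviation of the computed Gram matrix from the
identity; it should be at the level of floating-point rounding.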


14.2.3 Multiresolution Analysis 

We know from Sect. 14.2.2 that in order to use equation (14.13), we must find
an appropriate set of discrete wavelets that possesses both orthonormality
and completeness. In the remainder of this section, we describe a framework
for constructing such discrete wavelets that is based on the concept of
multiresolution analysis.

Multiresolution analysis involves a particular class of sets of function
spaces. Its most distinctive feature is that it establishes a nesting structure of
subspaces of $L^2(\mathbb{R})$ that allows us to construct a complete orthonormal set of
functions (i.e., an orthonormal basis) for $L^2(\mathbb{R})$. The resulting orthonormal
basis is simply the discrete wavelet $\psi_{m,n}(t)$ that yields the reconstruction
equation (14.13).

Multiresolution analysis: A multiresolution analysis involves a set
of function spaces that consists of a sequence $\{V_j : j \in \mathbb{Z}\}$ of closed
subspaces of $L^2(\mathbb{R})$. Here the subspaces $V_j$ satisfy the following conditions:

1. $\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset L^2(\mathbb{R})$.

2. $\bigcap_{j=-\infty}^{\infty} V_j = \{0\}$.

3. $f(t) \in V_j$ if and only if $f(2t) \in V_{j+1}$ for all integers $j$.

4. There exists a function $\phi(t) \in V_0$ such that the set $\{\phi(t - n), n \in \mathbb{Z}\}$
is an orthonormal basis for $V_0$.


The function $\phi(t)$ introduced above is called the scaling function (or father
wavelet). It should be emphasized that the above definition gives no information
as to the existence of (or the way to construct) the function $\phi(t)$ satisfying
condition 4. However, once we find such a function $\phi(t)$, we can establish a
multiresolution analysis $\{V_j\}$ by defining the function space $V_0$ spanned by
the orthonormal basis $\{\phi(t - n), n \in \mathbb{Z}\}$ and then forming the other subspaces
$V_j$ ($j \ne 0$) successively by using the property denoted in condition 3. If this is
achieved, we say that our scaling function $\phi(t)$ generates the multiresolution
analysis $\{V_j\}$.

Remark. There is no straightforward way to construct a scaling function $\phi(t)$
or, equivalently, a multiresolution analysis $\{V_j\}$. Nevertheless, many kinds of
scaling functions have been discovered by means of sophisticated mathematical
techniques. Here we omit the details of the derivations and just refer to
the resulting scaling functions as needed.


Examples Consider the space $V_m$ of all functions in $L^2(\mathbb{R})$ that are constant
in each interval $[2^{-m} n, 2^{-m}(n + 1)]$ for all $n \in \mathbb{Z}$. Obviously, the space $V_m$
satisfies conditions 1 to 3 of a multiresolution analysis. Furthermore, it is easy
to see that the set $\{\phi(t - n), n \in \mathbb{Z}\}$ depicted in Fig. 14.8, which is defined
by

$$\phi(t) = \begin{cases} 1, & 0 \le t < 1, \\ 0, & \text{otherwise}, \end{cases} \tag{14.16}$$

satisfies condition 4. Hence, any function $f \in V_0$ can be expressed by

$$f(t) = \sum_{n=-\infty}^{\infty} c_n\, \phi(t - n)$$

with appropriate constants $c_n$. Thus, the spaces $V_m$ constitute the
multiresolution analysis generated by the scaling function (14.16).


14.2.4 Orthogonal Decomposition 

The importance of a multiresolution analysis lies in its ability to construct an
orthonormal basis (i.e., a complete orthonormal set of functions) for $L^2(\mathbb{R})$.
In order to prove this statement, we first recall that a multiresolution analysis
$\{V_j\}$ satisfies the relation

$$V_0 \subset V_1 \subset V_2 \subset \cdots \subset L^2.$$

Fig. 14.8. Two different sets of functions: $V_0$ and $V_1$. (a) Orthonormal basis
for $V_0$, showing $\phi(t - 1)$, $\phi(t)$, $\phi(t + 1)$. (b) Orthonormal basis for $V_1$,
showing $\phi(2t - 1)$, $\phi(2t)$, $\phi(2t + 1)$, $\phi(2t + 2)$

We now define a space $W_0$ as the orthogonal complement of $V_0$ in $V_1$,
which yields

$$V_1 = V_0 \oplus W_0. \tag{14.17}$$

The space $W_0$ we have introduced is called the wavelet space of zero order;
the reason for the name is clarified in Sect. 14.2.5. The relation (14.17) extends
to

$$V_2 = V_1 \oplus W_1 = V_0 \oplus W_0 \oplus W_1 \tag{14.18}$$

or, more generally, it gives

$$L^2 = V_\infty = V_0 \oplus W_0 \oplus W_1 \oplus W_2 \oplus \cdots, \tag{14.19}$$

where $V_0$ is the initial space spanned by the set of functions $\{\phi(t - n), n \in \mathbb{Z}\}$.
Figure 14.9 illustrates the nesting structure of the spaces $V_j$ and $W_j$ for
different scales $j$.

Since the scale of the initial space is arbitrary, it can be chosen at a higher
resolution such as

$$L^2 = V_5 \oplus W_5 \oplus W_6 \oplus \cdots,$$

or at a lower resolution such as

$$L^2 = V_{-3} \oplus W_{-3} \oplus W_{-2} \oplus \cdots,$$

or even at negative infinity, where (14.19) becomes

$$L^2 = \cdots \oplus W_{-1} \oplus W_0 \oplus W_1 \oplus \cdots. \tag{14.20}$$

Fig. 14.9. Hierarchical structure of the spaces $V_j$ and $W_j$ as subspaces of $L^2$

The expression (14.20) is referred to as the orthogonal decomposition of
the $L^2$ space and indicates that any function $x \in L^2(\mathbb{R})$ can be decomposed
into the infinite sum of $g_j \in W_j$:

$$x(t) = \cdots + g_{-1}(t) + g_0(t) + g_1(t) + \cdots. \tag{14.21}$$


14.2.5 Constructing an Orthonormal Basis 

Let us further examine the orthogonal property of the wavelet spaces $\{W_j\}$.
From (14.17) and (14.18), we have

$$W_0 \subset V_1 \quad \text{and} \quad W_1 \subset V_2.$$

In view of the definition of the multiresolution analysis $\{V_j\}$, it follows that

$$f(t) \in V_1 \iff f(2t) \in V_2,$$

so

$$f(t) \in W_0 \iff f(2t) \in W_1. \tag{14.22}$$

Furthermore, condition 4 in Sect. 14.2.3 results in

$$f(t) \in W_0 \iff f(t - n) \in W_0 \quad \text{for any } n \in \mathbb{Z}. \tag{14.23}$$

The two results (14.22) and (14.23) are the ingredients for constructing the
orthonormal basis of $L^2(\mathbb{R})$ that we are looking for, as demonstrated
below.

We first assume that there exists a function $\psi(t)$ that leads to an orthonormal
basis $\{\psi(t - n), n \in \mathbb{Z}\}$ for the space $W_0$. Then, if we use the notation

$$\psi_{0,n}(t) = \psi(t - n) \in W_0,$$

it follows from (14.22) and (14.23) that its scaled version defined by

$$\psi_{1,n}(t) = \sqrt{2}\, \psi(2t - n)$$

serves as an orthonormal basis for $W_1$. The factor $\sqrt{2}$ was introduced to keep
the normalization condition

$$\int_{-\infty}^{\infty} \psi_{0,n}(t)^2\, dt = \int_{-\infty}^{\infty} \psi_{1,n}(t)^2\, dt = 1.$$

By repeating the same procedure, we find that the function

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n) \tag{14.24}$$

constitutes an orthonormal basis for the space $W_m$. Applying these results to
the expression (14.21), we have for any $x \in L^2(\mathbb{R})$,

$$\begin{aligned}
x(t) &= \cdots + g_{-1}(t) + g_0(t) + g_1(t) + \cdots \\
&= \cdots + \sum_{n=-\infty}^{\infty} c_{-1,n}\, \psi_{-1,n}(t) + \sum_{n=-\infty}^{\infty} c_{0,n}\, \psi_{0,n}(t) + \sum_{n=-\infty}^{\infty} c_{1,n}\, \psi_{1,n}(t) + \cdots \\
&= \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} c_{m,n}\, \psi_{m,n}(t).
\end{aligned} \tag{14.25}$$

Hence, the family $\psi_{m,n}(t)$ represents an orthonormal basis for $L^2(\mathbb{R})$. The
above arguments are summarized by the following theorem:

Theorem:

Let $\{V_j\}$ be a multiresolution analysis and define the space $W_0$ by $W_0 = V_1 \setminus V_0$.
If a function $\psi(t)$ that leads to an orthonormal basis $\{\psi(t - n), n \in \mathbb{Z}\}$
for $W_0$ is found, then the set of functions $\{\psi_{m,n}, m, n \in \mathbb{Z}\}$ given by

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^m t - n)$$

constitutes an orthonormal basis for $L^2(\mathbb{R})$.

Emphasis is placed on the fact that since $\psi_{m,n}(t)$ is the orthonormal basis
for $L^2(\mathbb{R})$, the coefficients $c_{m,n}$ in (14.25) are identical to the discrete wavelet
transform $T_{m,n}$ given by (14.11) (see Sect. 14.2.2). Therefore, the function
$\psi(t)$ we introduce here is identified with the wavelet in the framework of
continuous and discrete wavelet analysis, such as the Haar and the Mexican
hat wavelets. In this sense, each $W_m$ is referred to as the wavelet space and
the function $\psi(t)$ is sometimes called the mother wavelet.

14.2.6 Two-Scale Relations 

The preceding argument suggests that an orthonormal basis $\{\psi_{m,n}\}$ for $L^2(\mathbb{R})$
can be constructed by specifying the explicit functional form of the mother
wavelet $\psi(t)$. Thus the remaining task is to develop a systematic way of
determining the mother wavelet $\psi(t)$ that leads to an orthonormal basis
$\{\psi(t - n), n \in \mathbb{Z}\}$ for the space $W_0 = V_1 \setminus V_0$ contained in a given
multiresolution analysis. We shall see that $\psi(t)$ can be found by examining
the properties of the scaling function $\phi(t)$; we should recall that $\phi(t)$
yields an orthonormal basis $\{\phi(t - n), n \in \mathbb{Z}\}$ for the space $V_0$. (In this
context, the space $V_j$ is sometimes referred to as the scaling function
space.)

In this subsection, we make reference to an important feature of the scaling
function $\phi(t)$ called the two-scale relation, which plays a key role in
constructing the mother wavelet $\psi(t)$ of a given multiresolution analysis. We
already know that all the functions in $V_m$ are obtained from those in $V_0$
through scaling by $2^m$. Applying this result to the scaling function denoted
by

$$\phi_{0,n}(t) = \phi(t - n) \in V_0,$$

we find that

$$\phi_{m,n}(t) = 2^{m/2}\, \phi(2^m t - n), \quad m \in \mathbb{Z} \tag{14.26}$$

is an orthonormal basis for $V_m$. In particular, since $\phi \in V_0 \subset V_1$ and
$\phi_{1,n}(t) = \sqrt{2}\, \phi(2t - n)$ is an orthonormal basis for $V_1$, $\phi(t)$ can be expanded
in terms of $\phi_{1,n}(t)$. This is formally stated in the following theorem:


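Two-scale relation:

If the scaling function $\phi(t)$ generates a multiresolution analysis $\{V_j\}$, it
satisfies the recurrence relation

$$\phi(t) = \sum_{n=-\infty}^{\infty} p_n\, \phi_{1,n}(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} p_n\, \phi(2t - n), \tag{14.27}$$

where

$$p_n = \int_{-\infty}^{\infty} \phi(t)\, \phi_{1,n}(t)\, dt = \sqrt{2} \int_{-\infty}^{\infty} \phi(t)\, \phi(2t - n)\, dt. \tag{14.28}$$

This recurrence equation is called the two-scale relation of $\phi(t)$ and the
coefficients $p_n$ are called the scaling function coefficients.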


Remark. The two-scale relation is also referred to as the multiresolution
analysis equation, the refinement equation, or the dilation equation,
depending on the context.


Examples Consider again the space $V_m$ of all functions in $L^2(\mathbb{R})$ that are
constant on intervals $[2^{-m} n, 2^{-m}(n + 1)]$ with $n \in \mathbb{Z}$. This multiresolution
analysis is known to be generated by the scaling function $\phi(t)$ of (14.16).
Substituting (14.16) into (14.28), we obtain

$$p_0 = p_1 = \frac{1}{\sqrt{2}} \quad \text{and} \quad p_n = 0 \text{ for } n \ne 0, 1.$$

Thus the two-scale relation reads

$$\phi(t) = \phi(2t) + \phi(2t - 1).$$

This means that the scaling function in this case is a linear combination
of its contracted versions as depicted in Fig. 14.10.

Fig. 14.10. Two-scale relation of $\phi(t)$
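
For the Haar scaling function the two-scale relation can be verified directly.
The sketch below computes the coefficients $p_n$ as inner products of $\phi(t)$
with $\sqrt{2}\,\phi(2t - n)$, which is what the orthonormality of the functions
$\sqrt{2}\,\phi(2t - n)$ implies for an expansion of this form, and then compares
both sides of the relation at sample points. The grid, the index range, and the
names are assumptions for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def phi(t):
    """Haar scaling function (14.16): 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def scaling_coefficient(n, lo=-4.0, hi=4.0, cells_per_unit=16):
    """p_n as the inner product of phi(t) with sqrt(2) * phi(2t - n)."""
    h = 1.0 / cells_per_unit
    count = int((hi - lo) * cells_per_unit)
    total = 0.0
    for i in range(count):
        t = lo + (i + 0.5) * h  # midpoint of the i-th cell
        total += phi(t) * SQRT2 * phi(2.0 * t - n)
    return total * h


p = {n: scaling_coefficient(n) for n in range(-3, 4)}


def two_scale_rhs(t):
    """sqrt(2) * sum_n p_n phi(2t - n) with the coefficients computed above."""
    return SQRT2 * sum(p[n] * phi(2.0 * t - n) for n in p)


samples = [(i + 0.5) / 16.0 for i in range(-32, 48)]
max_gap = max(abs(phi(t) - two_scale_rhs(t)) for t in samples)

print({n: round(v, 6) for n, v in p.items() if v != 0.0}, max_gap)
```

Only $p_0$ and $p_1$ should be nonzero, both close to $1/\sqrt{2}$, and the two
sides of the relation should agree at every sample point up to rounding.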


14.2.7 Constructing the Mother Wavelet 

We are now in a position to determine the mother wavelet $\psi(t)$ whose
translates $\{\psi(t - n), n \in \mathbb{Z}\}$ enable us to establish an orthonormal basis
for $L^2(\mathbb{R})$. Recall that a mother wavelet $\psi(t) = \psi_{0,0}(t)$ resides in the
space $W_0$, which is contained in the next scaling function space $V_1$, i.e.,
$W_0 \subset V_1$. Hence, in the same context as in the previous subsection, $\psi(t)$ can
be represented by a weighted sum of the shifted scaling functions $\phi(2t - n)$:

$$\psi(t) = \sum_{n=-\infty}^{\infty} q_n\, \sqrt{2}\, \phi(2t - n), \quad n \in \mathbb{Z}. \tag{14.29}$$

The expansion coefficients $q_n$ are called wavelet coefficients and are given
by

$$q_n = (-1)^{n-1}\, p_{-n-1} \tag{14.30}$$

as stated below.


Theorem:

If $\{V_m\}$ is a multiresolution analysis with the scaling function $\phi(t)$, the
mother wavelet $\psi(t)$ is given by

$$\psi(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} (-1)^{n-1}\, p_{-n-1}\, \phi(2t - n), \quad n \in \mathbb{Z}, \tag{14.31}$$

where $p_n$ is the scaling function coefficient of $\phi(t)$.

Remember that $p_n$ in (14.31) is uniquely determined by the functional form
of the scaling function $\phi(t)$; see (14.28). Thus the above theorem states that
the mother wavelet $\psi(t)$ is obtained once the scaling function $\phi(t)$ of a given
multiresolution analysis is specified.

Remark. The relation $q_n = (-1)^{n-1} p_{-n-1}$ employed in equation (14.31) is one
possible choice for constructing the mother wavelet $\psi(t)$ from the father
wavelet $\phi(t)$. In fact, there are alternative choices such as

$$q_n = (-1)^n\, p_{1-n}$$

or

$$q_n = (-1)^{n-1}\, p_{2N-1-n}$$

with certain $N \in \mathbb{Z}$. Hence, the mother wavelet $\psi(t)$ associated with a given
multiresolution analysis is not unique. In practice, however, any of the preceding
definitions of $q_n$ can be used to obtain a mother wavelet $\psi(t)$ because each
leads to an orthonormal basis for the space $W_0$.
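
To make the non-uniqueness concrete, the following sketch builds $\psi(t)$ for
the Haar scaling function, assuming $p_0 = p_1 = 1/\sqrt{2}$ and all other
$p_n = 0$, from two of the coefficient choices: $q_n = (-1)^{n-1} p_{-n-1}$ and
$q_n = (-1)^n p_{1-n}$. The helper names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
P = {0: 1.0 / SQRT2, 1: 1.0 / SQRT2}  # assumed Haar scaling coefficients


def p(n):
    """Scaling function coefficient p_n (zero outside n = 0, 1)."""
    return P.get(n, 0.0)


def sign(k):
    """(-1)**k for any integer k, including negative ones."""
    return 1.0 if k % 2 == 0 else -1.0


def q_14_30(n):
    """Choice (14.30): q_n = (-1)^(n-1) p_(-n-1)."""
    return sign(n - 1) * p(-n - 1)


def q_alt(n):
    """Alternative choice: q_n = (-1)^n p_(1-n)."""
    return sign(n) * p(1 - n)


def phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def mother(q, t, n_range=range(-4, 5)):
    """psi(t) = sqrt(2) * sum_n q_n phi(2t - n)."""
    return SQRT2 * sum(q(n) * phi(2.0 * t - n) for n in n_range)


def haar(t):
    """Haar wavelet on [0, 1): +1 on the first half, -1 on the second."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


samples = [(i + 0.5) / 16.0 for i in range(-48, 48)]
gap_alt = max(abs(mother(q_alt, t) - haar(t)) for t in samples)
gap_14_30 = max(abs(mother(q_14_30, t) + haar(t + 1.0)) for t in samples)
print(gap_alt, gap_14_30)
```

Under these assumptions the second choice reproduces the Haar wavelet on
$[0, 1)$, while the first gives the same waveform up to a sign and a shift by
one unit to the left.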


The proof of equation (14.31) requires the following two lemmas:

Lemma 1:

The Fourier transform $\Phi(\omega)$ of the scaling function $\phi(t)$ satisfies

$$\Phi(\omega) = M\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right),$$

where $M(\omega)$ is the generating function of the multiresolution analysis
defined by

$$M(\omega) = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} p_n\, e^{-in\omega} \tag{14.32}$$

with the scaling function coefficient $p_n$ of $\phi(t)$.

Lemma 2:

The Fourier transform $F(\omega)$ of any function $f \in W_0$ can be expressed
by

$$F(\omega) = V(\omega)\, e^{i\omega/2}\, M^*\!\left(\frac{\omega}{2} + \pi\right) \Phi\!\left(\frac{\omega}{2}\right), \tag{14.33}$$

where $V(\omega)$ is a $2\pi$-periodic function, i.e., $V(\omega) = V(\omega + 2\pi)$.

We should keep in mind that $V(\omega)$ is the only term on the right-hand side of
(14.33) that depends on $f(t)$; the remaining factor $e^{i\omega/2} M^*[(\omega/2) + \pi]\, \Phi(\omega/2)$
is independent of $f(t)$. The proofs of the two lemmas are outlined in Exercises
3 and 4. Now we turn to a proof of equation (14.31) for the construction of
the mother wavelet $\psi(t)$ from the scaling function $\phi(t)$.

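Proof (of Theorem): Since the mother wavelet $\psi(t)$ gives an orthonormal
basis $\{\psi(t - n), n \in \mathbb{Z}\}$ for the space $W_0$, any function $f \in W_0$ can be
expressed by

$$f(t) = \sum_{n=-\infty}^{\infty} h_n\, \psi(t - n)$$

with appropriate coefficients $h_n$. Its Fourier transform $F(\omega)$ reads

$$F(\omega) = \left( \sum_{n=-\infty}^{\infty} h_n\, e^{-in\omega} \right) \Psi(\omega),$$

where the sum in parentheses is $2\pi$-periodic. Comparing this with (14.33), we
obtain

$$\Psi(\omega) = e^{i\omega/2}\, M^*\!\left(\frac{\omega}{2} + \pi\right) \Phi\!\left(\frac{\omega}{2}\right). \tag{14.34}$$

Substituting expression (14.32) into (14.34) yields

$$\begin{aligned}
\Psi(\omega) &= \frac{e^{i\omega/2}}{\sqrt{2}} \sum_{n=-\infty}^{\infty} p_n\, e^{in[(\omega/2)+\pi]}\, \Phi\!\left(\frac{\omega}{2}\right) \\
&= \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} p_n\, e^{in\pi}\, e^{i(n+1)(\omega/2)}\, \Phi\!\left(\frac{\omega}{2}\right) \\
&= \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{-k-1}\, (-1)^{k-1}\, e^{-ik\omega/2}\, \Phi\!\left(\frac{\omega}{2}\right) \quad [k = -n - 1].
\end{aligned}$$

Take the inverse Fourier transform of both sides to find

$$\begin{aligned}
\psi(t) &= \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{-k-1}\, (-1)^{k-1}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ik\omega/2}\, e^{i\omega t}\, \Phi\!\left(\frac{\omega}{2}\right) d\omega \\
&= \frac{2}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{-k-1}\, (-1)^{k-1}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(2t-k)\omega'}\, \Phi(\omega')\, d\omega' \quad [\omega' = \omega/2] \\
&= \sqrt{2} \sum_{k=-\infty}^{\infty} p_{-k-1}\, (-1)^{k-1}\, \phi(2t - k).
\end{aligned}$$

This is our desired result (14.31).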


14.2.8 Multiresolution Representation 

Through the discussions thus far, we have obtained an orthonormal basis
consisting of scaling functions $\phi_{j,k}(t)$ and wavelets $\psi_{j,k}(t)$ that span all of
$L^2(\mathbb{R})$. Since

$$L^2 = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots,$$

any function $x(t) \in L^2(\mathbb{R})$ can be expanded, e.g.,

$$x(t) = \sum_{k=-\infty}^{\infty} S_{j_0,k}\, \phi_{j_0,k}(t) + \sum_{k=-\infty}^{\infty} \sum_{j=j_0}^{\infty} T_{j,k}\, \psi_{j,k}(t). \tag{14.35}$$

Here, the initial scale $j_0$ could be zero or another integer or negative infinity
as in (14.13), where no scaling functions are used. The coefficients $T_{j,k}$ are
identified with the discrete wavelet transform given in (14.11). Often $T_{j,k}$ in
(14.35) is called the wavelet coefficient and $S_{j,k}$ is called the approximation
coefficient.

The representation (14.35) can be simplified by using the following no- 
tation. We denote the first summation on the right-hand side of (14.35) by 

OO 

Xj 0 {t) = S io,k<!>jo,k(t). (14.36) 

k=— oo 

Equation (14.36) is called the continuous approximation of the signal x(t) 
at scale jo. Observe that the continuous approximation approaches x(t) in the 
limit of jo —> oo, since in this case L 2 = Voo. In addition, we introduce the 
notation 

OO 

*j{t) = (14.37) 

k=— oo 

where Zj(t) is known as the signal detail at scale j. With these conventions, 
we can write (14.35) as 

OO 

x(t) = x jo {t) + *?(*)• (14.38) 

J=JO 

Equation (14.38) says that the original continuous signal x(t) is expressed as 
a combination of its continuous approximation Xj 0 at an arbitrary scale index 
jo added to a succession of signal details Zj(t) from scales jo up to infinity. 

Also noteworthy is the fact that due to the nested relation of Vj+ 1 = 
Vj ® Wj, we can write 

Xj+\{t) = Xj(t) + Zj(t). (14.39) 

This indicates that if we add the signal detail at an arbitrary scale (index 
j) to the continuous approximation at the same scale, we get the signal ap- 
proximation at an increased resolution (i.e., at a smaller scale, index j + 1). 
The important relation (14.39) between continuous approximations Xj (t) and 
signal details Zj(t) is called a multiresolution representation. 
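
The relation (14.39) can be illustrated with a minimal sketch, assuming the Haar system of Exercise 1 and a signal that is piecewise constant on the finer grid. The function name `haar_split` and the sample values are hypothetical; for the Haar case the approximation $x_j$ is the pairwise mean and the detail $z_j$ is plus or minus half the pairwise difference.

```python
def haar_split(fine):
    """Split samples of x_{j+1} (even length) into x_j and z_j on the same grid."""
    approx, detail = [], []
    for left, right in zip(fine[0::2], fine[1::2]):
        mean = (left + right) / 2   # value of x_j on the coarser cell
        diff = (left - right) / 2   # z_j is +diff on the left half, -diff on the right
        approx += [mean, mean]
        detail += [diff, -diff]
    return approx, detail


x_fine = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]   # hypothetical samples of x_{j+1}
x_coarse, z = haar_split(x_fine)
recombined = [a + d for a, d in zip(x_coarse, z)]    # x_j + z_j, cf. (14.39)
print(recombined == x_fine)
```

Adding the detail back to the coarse approximation restores the finer approximation sample by sample, which is the content of (14.39) in this special case.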


Exercises 

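1. Verify the orthonormality of the Haar discrete wavelet $\psi_{m,n}(t)$ defined by
$\psi_{m,n}(t) = 2^{m/2}\psi(2^m t - n)$, where

$$
\psi(t) =
\begin{cases}
1, & 0 \le t < 1/2, \\
-1, & 1/2 \le t < 1, \\
0, & \text{otherwise.}
\end{cases}
$$

Solution: We work with $\psi_{m,n}(t) = 2^{-m/2}\psi(2^{-m}t - n)$, which
differs from the definition above only by the relabeling $m \to -m$ and
therefore does not affect orthonormality. First we note that the norm
of $\psi_{m,n}(t)$ is unity:

$$ \int_{-\infty}^{\infty} \psi_{m,n}(t)^2\, dt = 2^{-m} \int_{-\infty}^{\infty} \left[\psi(2^{-m}t - n)\right]^2 dt = \int_{-\infty}^{\infty} \psi(u)^2\, du = 1. $$

Next, we obtain

$$
\begin{aligned}
\int_{-\infty}^{\infty} \psi_{m,n}(t)\psi_{k,\ell}(t)\, dt
&= \int_{-\infty}^{\infty} 2^{-m/2}\psi(2^{-m}t - n)\, 2^{-k/2}\psi(2^{-k}t - \ell)\, dt \\
&= 2^{m/2} \int_{-\infty}^{\infty} \psi(u)\, 2^{-k/2}\, \psi\!\left[2^{m-k}(u+n) - \ell\right] du.
\end{aligned}
\tag{14.40}
$$

If $m = k$, the integral in the last line in (14.40) reads

$$ \int_{-\infty}^{\infty} \psi(u)\psi(u + n - \ell)\, du = \delta_{0,n-\ell} = \delta_{n,\ell}, $$

since $\psi(u) \neq 0$ in $0 \le u \le 1$ and $\psi(u + n - \ell) \neq 0$ in
$\ell - n \le u \le \ell - n + 1$, so that these intervals are disjoint unless
$n = \ell$. Owing to symmetry if $m \neq k$, it suffices to look at the case of
$m > k$. Set $r = m - k \neq 0$ and $s = 2^r n - \ell$ in (14.40) to obtain

$$
\int_{-\infty}^{\infty} \psi_{m,n}(t)\psi_{k,\ell}(t)\, dt
= 2^{r/2} \int_{-\infty}^{\infty} \psi(u)\psi(2^r u + s)\, du
= 2^{r/2} \left[ \int_0^{1/2} \psi(2^r u + s)\, du - \int_{1/2}^{1} \psi(2^r u + s)\, du \right],
$$

in which the bracket is proportional to

$$ I = \int_s^a \psi(x)\, dx - \int_a^b \psi(x)\, dx = 0, \tag{14.41} $$

where $2^r u + s = x$, $a = s + 2^{r-1}$, $b = s + 2^r$. Since $s$ is an integer
and $r \ge 1$, the interval $[s,a]$ has integer endpoints, so it either
contains the support $[0,1]$ of the Haar wavelet $\psi(t)$ or meets it at
most at an endpoint. In both cases the first integral in (14.41)
vanishes, because the integral of $\psi$ over $[0,1]$ is zero. Similarly,
the second integral equals zero. We thus conclude that

$$ \int_{-\infty}^{\infty} \psi_{m,n}(t)\psi_{k,\ell}(t)\, dt = \delta_{m,k}\,\delta_{n,\ell}, $$

which means that the Haar discrete wavelet $\psi_{m,n}(t)$ is orthonormal.

♣

2. Let $\phi \in L^2(\mathbb{R})$ and $\Phi(\omega)$ be the Fourier transform of $\phi(t)$. Prove that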
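the system $\{\phi_{0,n}(t) = \phi(t - n),\ n \in \mathbb{Z}\}$ is orthonormal if and only if

$$ \sum_{k=-\infty}^{\infty} |\Phi(\omega + 2k\pi)|^2 = 1 \quad \text{almost everywhere.} $$

Solution: It is obvious that the Fourier transform of $\phi_{0,n}(t)$
reads $\Phi_{0,n}(\omega) = e^{-in\omega}\Phi(\omega)$. In view of the Parseval identity for
the wavelet transform (14.8), we have

$$
\begin{aligned}
\int_{-\infty}^{\infty} \phi_{0,n}(t)\phi_{0,m}(t)\, dt
&= \int_{-\infty}^{\infty} \phi_{0,0}(t)\phi_{0,m-n}(t)\, dt \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \Phi_{0,0}^*(\omega)\Phi_{0,m-n}(\omega)\, d\omega
 = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i(m-n)\omega}\, |\Phi_{0,0}(\omega)|^2\, d\omega \\
&= \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} \int_{2\pi k}^{2\pi(k+1)} e^{-i(m-n)\omega}\, |\Phi_{0,0}(\omega)|^2\, d\omega \\
&= \frac{1}{2\pi} \int_0^{2\pi} e^{-i(m-n)\omega} \sum_{k=-\infty}^{\infty} |\Phi_{0,0}(\omega + 2\pi k)|^2\, d\omega.
\end{aligned}
$$

It thus follows from the completeness of $\{e^{-in\omega},\ n \in \mathbb{Z}\}$ in
$L^2(0, 2\pi)$ that $\int_{-\infty}^{\infty} \phi_{0,n}(t)\phi_{0,m}(t)\, dt = \delta_{n,m}$ if and only if

$$ \sum_{k=-\infty}^{\infty} |\Phi_{0,0}(\omega + 2\pi k)|^2 = 1 \quad \text{almost everywhere.} \qquad ♣ $$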
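
The criterion of Exercise 2 can be checked numerically in one concrete case. The sketch below assumes the Haar scaling function, i.e., the indicator function of $[0,1)$, for which $|\Phi(\omega)|^2 = [\sin(\omega/2)/(\omega/2)]^2$. The function names and the truncation parameter `k_max` are hypothetical choices; the infinite sum is cut off at $|k| \le$ `k_max`, so the result is only approximately equal to 1.

```python
import math


def haar_phi_hat_sq(omega):
    """|Phi(omega)|^2 for phi = indicator of [0, 1): (sin(w/2) / (w/2))^2."""
    half = omega / 2
    if half == 0.0:
        return 1.0
    return (math.sin(half) / half) ** 2


def periodized_sum(omega, k_max=20000):
    """Truncated version of sum_k |Phi(omega + 2 k pi)|^2."""
    return sum(haar_phi_hat_sq(omega + 2 * math.pi * k)
               for k in range(-k_max, k_max + 1))


values = [periodized_sum(w) for w in (0.3, 1.0, 2.5)]
print(values)
```

The printed values stay close to 1 for each sampled frequency, consistent with the orthonormality of the integer translates of the Haar scaling function.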


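3. Let $\Phi(\omega)$ be the Fourier transform of the scaling function $\phi(t)$ and let $p_n$
be its scaling function coefficient. Prove that

$$ \Phi(\omega) = M\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right) \quad \text{with} \quad M(\omega) = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} p_n e^{-in\omega}. \tag{14.42} $$

Solution: Since $\phi(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} p_n \phi(2t - n)$, we have

$$
\begin{aligned}
\Phi(\omega) &= \sqrt{2} \sum_{n=-\infty}^{\infty} p_n \int_{-\infty}^{\infty} \phi(2t - n)\, e^{-i\omega t}\, dt \\
&= \frac{\sqrt{2}}{2} \sum_{n=-\infty}^{\infty} p_n \int_{-\infty}^{\infty} \phi(t')\, e^{-i\omega(t'+n)/2}\, dt' \qquad (t' = 2t - n) \\
&= \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} p_n e^{-in\omega/2}\, \Phi\!\left(\frac{\omega}{2}\right)
 = M\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right). \qquad ♣
\end{aligned}
$$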


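4. Let $f(t)$ be a function $f \in W_0 = V_1 \backslash V_0$ for a given multiresolution analysis
$\{V_j\}$. Prove that its Fourier transform $F(\omega)$ necessarily takes the form

$$ F(\omega) = V(\omega)\, e^{i\omega/2} M^*\!\left(\frac{\omega}{2} + \pi\right) \Phi\!\left(\frac{\omega}{2}\right), \tag{14.43} $$

where $V(\omega) = V(\omega + 2\pi)$.

Solution: Since $f \in W_0$ and $V_1 = V_0 \oplus W_0$, it follows that
$f \in V_1$ and is orthogonal to $V_0$. Hence, we can write
$f(t) = \sum_{n=-\infty}^{\infty} c_n \phi_{1,n}(t) = \sqrt{2} \sum_{n=-\infty}^{\infty} c_n \phi(2t - n)$,
where $c_n = \int_{-\infty}^{\infty} f(t)\phi_{1,n}(t)\, dt$. Take the Fourier transform of both sides to
obtain

$$ F(\omega) = M_f\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right) \quad \text{with} \quad M_f(\omega) = \frac{1}{\sqrt{2}} \sum_{n=-\infty}^{\infty} c_n e^{-in\omega}. \tag{14.44} $$

Evidently, $M_f(\omega)$ is a $2\pi$-periodic function belonging to $L^2(0, 2\pi)$.
Since $f$ is orthogonal to $V_0$, we have
$\int_{-\infty}^{\infty} F(\omega)\Phi^*(\omega)\, e^{in\omega}\, d\omega = 0$, so

$$ \int_0^{2\pi} \left[ \sum_{k=-\infty}^{\infty} F(\omega + 2k\pi)\Phi^*(\omega + 2k\pi) \right] e^{in\omega}\, d\omega = 0. $$

Consequently, $\sum_{k=-\infty}^{\infty} F(\omega + 2k\pi)\Phi^*(\omega + 2k\pi) = 0$. Substituting
(14.42) and (14.44) into this result, we obtain

$$ \sum_{k=-\infty}^{\infty} M_f\!\left(\frac{\omega}{2} + k\pi\right) M^*\!\left(\frac{\omega}{2} + k\pi\right) \left| \Phi\!\left(\frac{\omega}{2} + k\pi\right) \right|^2 = 0. $$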


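Meanwhile we denote $M_f(\omega)M^*(\omega)$ and $|\Phi(\omega)|^2$ by $M_2(\omega)$ and
$\Phi_2(\omega)$, respectively. By splitting the sum into even and odd integers
$k$ and then employing the $2\pi$-periodicity of $M(\omega)$ and $M_f(\omega)$ [and
thus $M_2(\omega)$], we have

$$
\begin{aligned}
0 &= \sum_{k=-\infty}^{\infty} M_2\!\left(\frac{\omega}{2} + 2k\pi\right) \Phi_2\!\left(\frac{\omega}{2} + 2k\pi\right)
 + \sum_{k=-\infty}^{\infty} M_2\!\left(\frac{\omega}{2} + (2k+1)\pi\right) \Phi_2\!\left(\frac{\omega}{2} + (2k+1)\pi\right) \\
&= M_2\!\left(\frac{\omega}{2}\right) \sum_{k=-\infty}^{\infty} \Phi_2\!\left(\frac{\omega}{2} + 2k\pi\right)
 + M_2\!\left(\frac{\omega}{2} + \pi\right) \sum_{k=-\infty}^{\infty} \Phi_2\!\left(\frac{\omega}{2} + (2k+1)\pi\right) \\
&= M_2\!\left(\frac{\omega}{2}\right) + M_2\!\left(\frac{\omega}{2} + \pi\right),
\end{aligned}
\tag{14.45}
$$

where we used the orthonormality condition with respect to the set
of scaling functions $\{\phi_{0,k}(t)\}$. Finally, replacing $\omega$ in the last line
in (14.45) by $2\omega$ gives

$$
\begin{vmatrix}
M_f(\omega) & M^*(\omega + \pi) \\
-M_f(\omega + \pi) & M^*(\omega)
\end{vmatrix}
= 0,
\tag{14.46}
$$

which indicates the linear dependence of two vectors:
$[M_f(\omega), -M_f(\omega + \pi)]$ and $[M^*(\omega + \pi), M^*(\omega)]$. Hence, there exists a function
$\lambda(\omega)$ such that

$$ M_f(\omega) = \lambda(\omega) M^*(\omega + \pi). \tag{14.47} $$

Since both $M$ and $M_f$ are $2\pi$-periodic, so is $\lambda$. Further, substituting
(14.47) into (14.46) yields

$$ \lambda(\omega) + \lambda(\omega + \pi) = 0, \tag{14.48} $$

which means that there exists a function $V(\omega)$ such that
$\lambda(\omega) = e^{i\omega} V(2\omega)$ and $V(\omega) = V(\omega + 2\pi)$.

Eventually, the results (14.44), (14.47), and (14.48) lead to the
desired representation (14.43). ♣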


14.3 Fast Wavelet Transformation 

14.3.1 Generalized Two-Scale Relations 

We know that a signal $x(t) \in L^2(\mathbb{R})$ can be represented in terms of the
approximation coefficients $S_{m,n}$ and the discrete wavelet transform $T_{m,n}$ by

$$ x(t) = \sum_{n=-\infty}^{\infty} S_{m_0,n}\,\phi_{m_0,n}(t) + \sum_{m=m_0}^{\infty} \sum_{n=-\infty}^{\infty} T_{m,n}\,\psi_{m,n}(t), $$

where

$$ \phi_{m,n}(t) = 2^{m/2}\phi(2^m t - n) \quad \text{and} \quad \psi_{m,n}(t) = 2^{m/2}\psi(2^m t - n). \tag{14.49} $$

[See (14.24) and (14.26).] In principle, both expansion coefficients $S_{m,n}$ and
$T_{m,n}$ can be computed through the convolution integral defined by

$$ S_{m,n} = \int_{-\infty}^{\infty} x(t)\phi_{m,n}(t)\, dt \quad \text{and} \quad T_{m,n} = \int_{-\infty}^{\infty} x(t)\psi_{m,n}(t)\, dt. \tag{14.50} $$

Actual computations of these integrals are very time-consuming. However,
there is an efficient method for computing $S_{m,n}$ and $T_{m,n}$ at all $m$, known as
the fast wavelet transform. This method is based on recursive
equations for $S_{m,n}$ and $T_{m,n}$ and is thus well suited to numerical
computations of wavelet analyses.

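To proceed with the argument, we need some preliminary results. We know
that the father wavelet $\phi(t)$ and the mother wavelet $\psi(t)$ can be described by
a linear combination of contracted and shifted versions of $\phi(t)$ as follows:

$$ \phi(t) = \sum_{k=-\infty}^{\infty} p_k\, \phi(2t - k) \quad \text{and} \quad \psi(t) = \sum_{k=-\infty}^{\infty} (-1)^k p_{1-k}\, \phi(2t - k), $$

where $p_n$ is the scaling function coefficient of $\phi(t)$. In this section the
coefficients are normalized so that the prefactor $\sqrt{2}$ appearing in (14.27)
is absorbed into $p_k$; this is the normalization that produces the factor
$2^{-1/2}$ in (14.51) and (14.52). For convenience, we use an
alternative definition $q_n = (-1)^n p_{1-n}$ of the wavelet coefficient $q_n$ instead of
the one used in (14.30). These facts immediately result in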


$$
\begin{aligned}
\phi_{m,n}(t) &= 2^{m/2}\phi(2^m t - n) = 2^{m/2} \sum_{k=-\infty}^{\infty} p_k\, \phi\!\left[2(2^m t - n) - k\right] \\
&= 2^{m/2} \sum_{k=-\infty}^{\infty} p_k\, 2^{-(m+1)/2}\, \phi_{m+1,2n+k}(t) \\
&= 2^{-1/2} \sum_{k=-\infty}^{\infty} p_k\, \phi_{m+1,2n+k}(t),
\end{aligned}
\tag{14.51}
$$

and similarly,

$$ \psi_{m,n}(t) = 2^{-1/2} \sum_{k=-\infty}^{\infty} q_k\, \phi_{m+1,2n+k}(t). \tag{14.52} $$

The expressions (14.51) and (14.52) are generalizations of (14.27) and (14.31)
applicable for $\phi(t)$ and $\psi(t)$.


♠ Generalized two-scale relations:

Given a multiresolution analysis, $\phi_{m,n}(t)$ and $\psi_{m,n}(t)$ are obtained from
the set of functions $\{\phi_{m+1,2n+k}(t);\ -\infty < k < \infty\}$ by

$$ \phi_{m,n}(t) = 2^{-1/2} \sum_{k=-\infty}^{\infty} p_k\, \phi_{m+1,2n+k}(t), $$

$$ \psi_{m,n}(t) = 2^{-1/2} \sum_{k=-\infty}^{\infty} q_k\, \phi_{m+1,2n+k}(t). $$

14.3.2 Decomposition Algorithm

The fast wavelet transform consists of two main parts, called, respectively,
the decomposition algorithm and the reconstruction algorithm, each
of which gives a recursive relation between approximation coefficients $S_{m,n}$
and wavelet coefficients $T_{m,n}$ at neighboring scales. This subsection focuses
on the former algorithm, and the next subsection deals with the latter.

Remark. In the literature about the fast wavelet transform, all of the terms
below mean the same thing:

- discrete wavelet transform

- decomposition/reconstruction algorithm

- fast orthogonal wavelet transform

- multiresolution algorithm

- pyramid algorithm

- tree algorithm


The decomposition algorithm enables us to obtain $S_{m,n}$ and $T_{m,n}$ at all $m$
smaller than a prescribed scale $m_0$, once $S_{m_0,n}$ is given. To attain our
objective, we first derive a recursive formula for $S_{m,n}$ at two different scales,
i.e., $S_{m,n}$ and $S_{m+1,n}$. From the expansion (14.49) and from the orthonormality
of $\phi_{m,n}(t)$, it follows that

$$ S_{m,n} = \int_{-\infty}^{\infty} x(t)\phi_{m,n}(t)\, dt. $$

Using the generalized two-scale relation (14.51), we can write

$$
\begin{aligned}
S_{m,n} &= \int_{-\infty}^{\infty} x(t) \left[ \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_k\, \phi_{m+1,2n+k}(t) \right] dt \\
&= \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_k \int_{-\infty}^{\infty} x(t)\phi_{m+1,2n+k}(t)\, dt \\
&= \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_k\, S_{m+1,2n+k}.
\end{aligned}
$$

Replacing the summation index $k$ with $k - 2n$, we obtain

$$ S_{m,n} = \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{k-2n}\, S_{m+1,k}, \tag{14.53} $$

which provides the approximation coefficients $S_{m,n}$ from $S_{m+1,n}$.

Similarly, the wavelet coefficients $T_{m,n}$ can be found from the
approximation coefficients at the previous scale:

$$ T_{m,n} = \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} q_k\, S_{m+1,2n+k} = \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} q_{k-2n}\, S_{m+1,k}. \tag{14.54} $$

As a consequence, if we know the approximation coefficients $S_{m_0,k}$ at a specific
scale $m_0$ then, through repeated application of (14.53) and (14.54), we can
generate $S_{m,n}$ and $T_{m,n}$ at all $m < m_0$. This procedure, called the
decomposition algorithm and based on (14.53) and (14.54), is the first half of
the fast wavelet transform. It allows us to compute the wavelet coefficients
efficiently, rather than computing them laboriously from the convolution of
(14.50).
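
The recursion (14.53)-(14.54) can be sketched in a few lines of code. This is an illustration under stated assumptions, not a definitive implementation: the coefficients are the Haar values $p_0 = p_1 = 1$ (in the normalization that carries the factor $1/\sqrt{2}$), the input is a finite record, and samples outside the record are treated as zero. The names `wavelet_filter`, `decompose_step`, and `HAAR_P`, as well as the sample values, are hypothetical.

```python
import math


def wavelet_filter(p):
    """Build q_n = (-1)^n p_{1-n} from the scaling coefficients p (dict: index -> value)."""
    return {1 - k: (1 if (1 - k) % 2 == 0 else -1) * pk for k, pk in p.items()}


def decompose_step(s_fine, p):
    """One application of (14.53)-(14.54): S_{m+1,.} -> (S_{m,.}, T_{m,.})."""
    q = wavelet_filter(p)
    size = len(s_fine)
    s_coarse, t_coarse = [], []
    for n in range(size // 2):
        # Sums over k with index 2n+k, equivalent to (14.53)-(14.54) after k -> k - 2n;
        # samples outside the finite record are treated as zero (an assumption).
        s = sum(pk * s_fine[2 * n + k] for k, pk in p.items() if 0 <= 2 * n + k < size)
        t = sum(qk * s_fine[2 * n + k] for k, qk in q.items() if 0 <= 2 * n + k < size)
        s_coarse.append(s / math.sqrt(2))
        t_coarse.append(t / math.sqrt(2))
    return s_coarse, t_coarse


HAAR_P = {0: 1.0, 1: 1.0}            # assumed Haar coefficients in this normalization
s_top = [4.0, 2.0, 5.0, 5.0]         # hypothetical S_{m0,n}
s1, t1 = decompose_step(s_top, HAAR_P)   # coefficients at scale m0 - 1
s0, t0 = decompose_step(s1, HAAR_P)      # coefficients at scale m0 - 2
print(s1, t1, s0, t0)
```

Each step halves the number of approximation coefficients, so the total work is proportional to the length of the initial record. For an orthonormal system the sum of squares of the coefficients is preserved from one scale to the next.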


14.3.3 Reconstruction Algorithm 

We can go in the opposite direction and reconstruct $S_{m+1,n}$ from $S_{m,n}$ and
$T_{m,n}$. We already know from (14.39) that $x_{m+1}(t) = x_m(t) + z_m(t)$, and we
can expand this as

$$ x_{m+1}(t) = \sum_{n=-\infty}^{\infty} S_{m,n}\,\phi_{m,n}(t) + \sum_{n=-\infty}^{\infty} T_{m,n}\,\psi_{m,n}(t). $$

Furthermore, using (14.51) and (14.52), we can expand this equation in terms
of the scaling function at the previous scale:

$$
x_{m+1}(t) = \sum_{n=-\infty}^{\infty} S_{m,n} \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_k\, \phi_{m+1,2n+k}(t)
 + \sum_{n=-\infty}^{\infty} T_{m,n} \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} q_k\, \phi_{m+1,2n+k}(t).
$$

Rearranging the summation indices, we get

$$
x_{m+1}(t) = \sum_{n=-\infty}^{\infty} S_{m,n} \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{k-2n}\, \phi_{m+1,k}(t)
 + \sum_{n=-\infty}^{\infty} T_{m,n} \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} q_{k-2n}\, \phi_{m+1,k}(t).
\tag{14.55}
$$

We also know that we can expand $x_{m+1}(t)$ in terms of the approximation
coefficients at scale $m + 1$, i.e.,

$$ x_{m+1}(t) = \sum_{k=-\infty}^{\infty} S_{m+1,k}\,\phi_{m+1,k}(t). \tag{14.56} $$

Equating the coefficients in (14.56) with (14.55) yields the reconstruction
algorithm:

$$ S_{m+1,n} = \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} p_{n-2k}\, S_{m,k} + \frac{1}{\sqrt{2}} \sum_{k=-\infty}^{\infty} q_{n-2k}\, T_{m,k}, $$

where we have swapped the indices $k$ and $n$. Hence, at the scale $m + 1$, the
approximation coefficients $S_{m+1,n}$ can be found in terms of a combination
of $S_{m,n}$ and $T_{m,n}$ at the next scale, $m$. The reconstruction algorithm is the
second half of the fast wavelet transform.
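
A minimal sketch of one reconstruction step follows, again assuming the Haar coefficients $p_0 = p_1 = 1$ and $q_0 = 1$, $q_1 = -1$ (from $q_n = (-1)^n p_{1-n}$) and a finite record. The function name `reconstruct_step` and the numerical values are hypothetical; the coarse coefficients used here are those obtained by applying the decomposition formulas (14.53)-(14.54) by hand to the record $[4, 2, 5, 5]$.

```python
import math


def reconstruct_step(s_coarse, t_coarse, p, q):
    """S_{m+1,n} = (1/sqrt 2) * sum_k [p_{n-2k} S_{m,k} + q_{n-2k} T_{m,k}]."""
    fine = []
    for n in range(2 * len(s_coarse)):
        total = 0.0
        for k in range(len(s_coarse)):
            # Coefficients with indices outside the filter support are zero.
            total += p.get(n - 2 * k, 0.0) * s_coarse[k]
            total += q.get(n - 2 * k, 0.0) * t_coarse[k]
        fine.append(total / math.sqrt(2))
    return fine


P = {0: 1.0, 1: 1.0}       # assumed Haar scaling coefficients
Q = {0: 1.0, 1: -1.0}      # q_n = (-1)^n p_{1-n}
root2 = math.sqrt(2)
s_coarse = [6 / root2, 10 / root2]   # S_{m,k} for the record [4, 2, 5, 5]
t_coarse = [2 / root2, 0.0]          # T_{m,k} for the same record
s_fine = reconstruct_step(s_coarse, t_coarse, P, Q)
print(s_fine)
```

Up to rounding, the output reproduces the record $[4, 2, 5, 5]$, showing that the reconstruction step inverts the decomposition step in this example.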



Part V 


Differential Equations 



15 


Ordinary Differential Equations 


Abstract The main objective of this chapter is to ensure that the reader
understands the “existence theorem” (Sect. 15.2.3) and the “uniqueness theorem”
(Sect. 15.2.4) for a first-order ordinary differential equation. These theorems
prove the existence and uniqueness of a solution of the differential equation and
delineate the conditions that should be satisfied by the functions that are to be
differentiated.


15.1 Concepts of Solutions 

15.1.1 Definition of Ordinary Differential Equations 

Many physical laws are often formulated as ordinary differential equa- 
tions (ODEs) whose unknowns are functions of a single variable. Below are 
basic notation and several important theorems that are used throughout this 
chapter. We start with the formal definition of ODEs. 


♠ Ordinary differential equations:

An ordinary differential equation of order $n$ is an equation

$$ F\left[x, y(x), y'(x), \cdots, y^{(n)}(x)\right] = 0 \tag{15.1} $$

that is satisfied by the function $y(x)$ and its derivatives
$y'(x), y''(x), \cdots, y^{(n)}(x)$ with respect to a single independent variable $x$.


Here, the order of a differential equation means the largest positive integer $n$
for which an $n$th derivative appears in equation (15.1). For instance, a general
form of the first-order differential equations is given by

$$ F[x, y(x), y'(x)] = 0, \tag{15.2} $$

where $F$ is a single-valued function of its arguments in some domain $D$.
Hereafter we restrict our attention to the case where $x$ is a real number.


Remark.

1. An ODE (15.1) is called a linear ODE if it is linear in the unknown
function $y(x)$ and in all its derivatives; otherwise, it is nonlinear.

2. A linear ODE of order $n$ is said to be homogeneous if it is of the form
$a_n(x)y^{(n)} + a_{n-1}(x)y^{(n-1)} + \cdots + a_1(x)y' + a_0(x)y = 0$, where there is
no term that contains a function of $x$ alone.

3. The term homogeneous may have a totally different meaning specifically
when an ODE is first order, which occurs if the ODE is written in
the form

$$ \frac{dy}{dx} = F\!\left(\frac{y}{x}\right). \tag{15.3} $$

Such equations can be solved in closed form by a change of variables
$u = y/x$, which transforms the equation into the separable equation

$$ \frac{dx}{x} = \frac{du}{F(u) - u}. \tag{15.4} $$

15.1.2 Explicit Solution 

Let $y = \varphi(x)$ define $y$ as a function of $x$ on an interval $I = (a, b)$. We say that
the function $\varphi(x)$ is an explicit solution or a simple solution of the ODE
(15.1) if it satisfies the equation for every $x$ in $I$. In mathematical symbols,
this definition reads as follows:


♠ Explicit solution of an ODE:

A function $y = \varphi(x)$ defined on an interval $I$ is a solution of the ODE
(15.1) if

$$ F\left[x, \varphi(x), \varphi'(x), \cdots, \varphi^{(n)}(x)\right] = 0 $$

for every $x$ in $I$.


Note that a real function should be a correspondence between two sets of real
numbers. In this context, if an equation involving $x$ and $y$ does not define
a real function, then it is not a solution of any ODE even if the equation
formally satisfies the ODE. For example, the equation

$$ y = \sqrt{-(1 + x^2)} \tag{15.5} $$

does not define a real function; therefore, it is not a solution of the ODE

$$ x + yy' = 0 \tag{15.6} $$

even though the formal substitution of (15.5) into (15.6) yields
an identity.

Examples 1. The function

$$ y = \log x + c, \quad x > 0 $$

is a solution of $y' = 1/x$ for all $x > 0$.

2. The function

$$ y = \tan x - x, \quad x \neq \frac{2n+1}{2}\pi \quad (n = 0, \pm 1, \pm 2, \cdots) \tag{15.7} $$

is a solution of

$$ y' = (x + y)^2. \tag{15.8} $$

In fact, the substitution of $y$ into (15.8) gives the identity
$\tan^2 x = (x + \tan x - x)^2 = \tan^2 x$ in each of the intervals specified in (15.7).


Remark. Note that the ODE (15.8) is defined for all $x$, but its solution
(15.7) is not defined for all $x$. Hence, an interval on which the function
given by (15.7) is a solution of (15.8) must be contained in one of the
intervals specified in (15.7).


3. The function $y = |x|$ is a solution of

$$ y' = 1 \quad \text{in the interval } x > 0, $$

and is also a solution of

$$ y' = -1 \quad \text{in the interval } x < 0. $$

Remark. Observe that the function $y = |x|$ is defined for all $x$, whereas the
corresponding ODEs are defined in only a restricted interval of $x$, in contrast
to Example 2.
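
Whether a candidate function is an explicit solution can be spot-checked numerically by substituting it into the ODE. The sketch below does this for Example 2, $y = \tan x - x$ and $y' = (x + y)^2$, at a few points inside the interval $(-\pi/2, \pi/2)$. The central-difference derivative, the step size `h`, and the function names are assumptions made for illustration; such a check supports, but does not prove, that the function is a solution.

```python
import math


def derivative(func, x, h=1e-5):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def y(x):
    """Candidate solution (15.7)."""
    return math.tan(x) - x


def residual(x):
    """Left side minus right side of the ODE (15.8), y' - (x + y)^2."""
    return derivative(y, x) - (x + y(x)) ** 2


residuals = [residual(x) for x in (-1.0, 0.3, 1.0)]
print(residuals)
```

The residuals are zero up to the discretization error of the finite difference, in agreement with the identity $\tan^2 x = \tan^2 x$ obtained by direct substitution.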


15.1.3 Implicit Solution 

It is sometimes not easy (or even impossible) to solve an equation of the form
$g(x, y) = 0$ for $y$ in terms of $x$. However, whenever it can be shown that an
implicit function does satisfy a given ODE on an interval $I$, then the relation
$g(x, y) = 0$ is called an implicit solution of the ODE. A formal definition is
given below.

♠ Implicit solution of an ODE:

A relation $g(x, y) = 0$ is an implicit solution of an ODE

$$ F\left[x, y(x), y'(x), \cdots, y^{(n)}(x)\right] = 0 $$

on an interval $I$ if:

1. There exists a function $h(x)$ defined on $I$ such that $g(x, h(x)) = 0$ for
every $x$ in $I$.

2. $F\left[x, h(x), h'(x), \cdots, h^{(n)}(x)\right] = 0$ for every $x$ in $I$.


Remark. It must be cautioned that $g(x, y) = 0$ is merely an equation, and it is
thus never a precise solution of an ODE, as only a function can be a solution
of an ODE. What we mean in the above definition is that the function $h(x)$
defined by the relation $g(x, y) = 0$ is the solution of the ODE.


Examples The equation

$$ g(x, y) = x^2 + y^2 - 25 = 0 $$

is an implicit solution of the ODE

$$ F(x, y, y') = yy' + x = 0 $$

on the interval $I: -5 < x < 5$. In fact, the function $h(x) = \sqrt{25 - x^2}$ defined
on $I$ yields

$$ F[x, h(x), h'(x)] = \sqrt{25 - x^2} \left( -\frac{x}{\sqrt{25 - x^2}} \right) + x = 0 $$

for every $x$ on $I$.


15.1.4 General and Particular Solutions 

We next observe that an ODE in general has many solutions. For example,
the ODE

$$ y' = e^x $$

can be solved as

$$ y = e^x + c, \tag{15.9} $$

where $c$ can take any numerical value. Similarly, if

$$ y''' = e^x, \tag{15.10} $$

then its solution, obtained by integrating three times, is

$$ y = e^x + c_1 x^2 + c_2 x + c_3, \tag{15.11} $$

where $c_1, c_2, c_3$ can take on any numerical values. Note that both (15.9) and
(15.11) express infinitely many solutions since the $c$'s, which are constants,
can have infinitely many values. Figure 15.1 is a geometrical interpretation of
this point. Each curve corresponds to a solution (15.11) for $c_2 = -5, 1, 4$ and
$c_3 = -2, 1, 3$ while $c_1 = 1$ is fixed.

Fig. 15.1. Family of the infinitely many solutions (15.11) of the differential equation
(15.10). Solid and dotted curves correspond to $c_2 = 1$ and $c_2 = 3$, respectively


The two examples above illustrate that solutions of an ODE may often
be represented by a single equation involving an arbitrary constant $c$. Such
a function involving an arbitrary constant is called a general solution (or
complete integral or primitive integral) of an ODE. Geometrically, these
are infinitely many curves, one for each set of values of the $c$'s. If we choose
specific values of the $c$'s, we obtain what is called a particular solution of
that ODE.

Remark. From the examples above, the reader might assume that

(i) an ODE always has infinitely many solutions, or that
(ii) a solution of an $n$th order ODE always contains $n$ arbitrary constants.

However, these two conjectures are false. For instance,

• The equation $(y'')^2 + y^2 = 0$ has only one solution $y = 0$ that possesses
no arbitrary constant.

• The equation $|y'| + 1 = 0$ has no solution.

• The first-order equation $(y' - y)(y' - 2y) = 0$ has the solution
$(y - C_1 e^x)(y - C_2 e^{2x}) = 0$ that has two (not one) arbitrary constants.

15.1.5 Singular Solution

Consider an ODE of the form

$$ y - xy' = f(y'), \tag{15.12} $$

which is known as a Clairaut equation. We solve it by differentiating both
sides to yield

$$ y''\left[f'(y') + x\right] = 0. $$

We thus have two possibilities. If we set $y'' = 0$, then $y = ax + b$ so that
substitution back into the original equation (15.12) gives $b = f(a)$. Thus we
have a general solution:

$$ y = ax + f(a), $$

where $a$ is an arbitrary constant. On the other hand, if we set

$$ f'(y') + x = 0, \tag{15.13} $$

then eliminating $y'$ between (15.13) and the original equation gives us a
solution with no arbitrary constant, which is known as a singular solution.
There are various other types of singular solutions, one of which is given below.

Examples Suppose the Clairaut equation to be of the form

$$ y = xy' + (y')^2 $$

and differentiate both sides to obtain

$$ y''(x + 2y') = 0. $$

If we set $y'' = 0$, then the general solution reads

$$ y = cx + c^2 \tag{15.14} $$

with an arbitrary constant $c$. However, if we choose the possibility that
$2y' + x = 0$, then we have

$$ x^2 + 4y = 0. \tag{15.15} $$


Remark. Geometrically, the singular solution (15.15) is an envelope of the
family of integral curves defined by the general solution (15.14), as depicted
in Fig. 15.2. The dotted parabola is the singular solution and the straight
lines tangent to the parabola are the general solution.

Fig. 15.2. The singular solution (15.15) is an envelope of the family of integral
curves (see Sect. 15.1.6), which are defined by the general solution (15.14)
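
The envelope property can be verified with exact rational arithmetic. The sketch below, with hypothetical function names, checks that both the lines $y = cx + c^2$ and the parabola $y = -x^2/4$ satisfy $y = xy' + (y')^2$, and that each line touches the parabola at $x = -2c$ with the same value and the same slope.

```python
from fractions import Fraction


def clairaut_residual(x, y, dy):
    """Residual of the Clairaut equation y = x y' + (y')^2."""
    return y - x * dy - dy ** 2


def line(c, x):
    """General solution (15.14): returns (y, y')."""
    return c * x + c ** 2, c


def envelope(x):
    """Singular solution (15.15), y = -x^2 / 4: returns (y, y')."""
    return -x ** 2 / 4, -x / 2


xs = [Fraction(n, 2) for n in range(-6, 7)]
cs = [Fraction(n, 3) for n in range(-4, 5)]

line_residuals = [clairaut_residual(x, *line(c, x)) for c in cs for x in xs]
envelope_residuals = [clairaut_residual(x, *envelope(x)) for x in xs]
tangency = [line(c, -2 * c) == envelope(-2 * c) for c in cs]
print(all(r == 0 for r in line_residuals + envelope_residuals), all(tangency))
```

Because `Fraction` arithmetic is exact, the residuals vanish identically at the sampled points rather than only to rounding error.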


15.1.6 Integral Curve and Direction Field 

Before closing this section, we must emphasize the geometric significance of
a solution of a first-order ODE. In many practical problems, a rough
geometrical approximation to a solution may be all that is needed rather than an
evaluation of its explicit functional form. Let

$$ y = f(x) \quad \text{or} \quad g(x, y) = 0 $$

define a function of $x$ whose derivative $y'$ exists on an interval $I: a < x < b$.
Then $y'$ gives the direction of the tangent to the curve at each of these points.
Therefore, finding a solution for

$$ y' = F(x, y), \quad a < x < b \tag{15.16} $$

can be reduced to finding a curve on the $(x$-$y)$-plane whose slope at each of
its points is given by (15.16). The relevant terminology is given below.


♠ Integral curve:

If a curve $y = f(x)$ [or $g(x, y) = 0$] satisfies a first-order ODE (15.16) on
an interval $I$, then the graph of this function is called an integral curve.

Obviously, an integral curve is the graph of a function that is a solution of
a first-order ODE (15.16). Therefore, even if we cannot find an elementary
function that is a solution of (15.16), we can draw a small line element at
any point on the $(x$-$y)$-plane for which $x$ is in $I$ to represent the slope of an
integral curve. If this line is short enough, the curve itself over that length
resembles the line. These lines are called line elements and an ensemble of
such lines is called a direction field.
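
The idea of line elements can be turned into a small computation. The sketch below tabulates the slopes of a direction field on a grid and then follows the line elements in short straight steps to trace an approximate integral curve; this stepping procedure is Euler's method, which is brought in here only as an illustration and is not discussed in the text. The test equation $y' = y$, the grid, and all function names are assumptions.

```python
import math


def line_elements(F, xs, ys):
    """Slope of the line element at each grid point (x, y) for y' = F(x, y)."""
    return {(x, y): F(x, y) for x in xs for y in ys}


def follow_field(F, x0, y0, x_end, steps):
    """Trace an approximate integral curve by chaining short line elements."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    points = [(x, y)]
    for _ in range(steps):
        y += h * F(x, y)   # move along the line element through (x, y)
        x += h
        points.append((x, y))
    return points


F = lambda x, y: y                       # assumed example: y' = y
field = line_elements(F, [0.0, 0.5, 1.0], [1.0, 2.0])
curve = follow_field(F, 0.0, 1.0, 1.0, 1000)
print(field[(0.5, 2.0)], curve[-1][1], math.e)
```

With 1000 steps the end point of the traced curve lies close to $e$, the value at $x = 1$ of the exact integral curve $y = e^x$ through $(0, 1)$.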


Exercises 

1. Test whether the relation

$$ xy^2 - e^{-y} - 1 = 0 \tag{15.17} $$

is an implicit solution of the ODE

$$ \left(xy^2 + 2xy - 1\right) y' + y^2 = 0. \tag{15.18} $$

Solution: If we blindly differentiate both sides to yield

$$ 2xyy' + y^2 + e^{-y}y' = 0 \;\Rightarrow\; \left(2xy + e^{-y}\right) y' + y^2 = 0 $$

and then eliminate $e^{-y}$ from the final result by using (15.17), we
obtain the ODE (15.18). This implies the possibility that (15.17)
is an implicit solution of the ODE (15.18). The remaining task is,
therefore, to determine the interval $I$ on which we can define
a function $y = h(x)$ that satisfies the relation (15.17) for every $x$
on $I$.

As a first step, we write (15.17) as

$$ y = \pm \sqrt{\frac{1 + e^{-y}}{x}}, $$

which says that $y$ is defined only for $x > 0$ since $e^{-y}$ is always
positive. Hence, the interval for which (15.17) may be a solution
of (15.18) must exclude values of $x \le 0$.

Next, we depict a graph of equation (15.17) on the $(x$-$y)$-plane
(see Fig. 15.3). From the graph, we see that there are three choices
for the function $y = h(x)$, each of which gives a one-to-one relation
between $x$ and $y$. If we choose the upper branch ($y > 0$), then we
can say that “(15.17) is an implicit solution of (15.18) for all $x > 0$.”
If we choose either of the two lower branches (one is above the
dashed line and the other is below), then we can say that “(15.17)
is an implicit solution of (15.18) only for $x > x_0 \simeq 2.07$.” ♣

Fig. 15.3. The curve of the function (15.17)


15.2 Existence Theorem for the First-Order ODE 

15.2.1 Picard Method 

In this section, we consider a first-order ODE of the form

$$ y'(x) = f(x, y(x)), \tag{15.19} $$

where $f$ is some continuous function. Our main purpose is to prove that:

(i) a wide class of equations of the form (15.19) have solutions, and

(ii) solutions to initial value problems

$$ y'(x) = f(x, y(x)), \quad y(x_0) = y_0 $$

are unique. Statements (i) and (ii) are supported by the existence
theorem and the uniqueness theorem, respectively, as is demonstrated in the
subsequent subsections.

Our proof of the two theorems is based on what we call Picard's method,
which gives solutions of an initial value problem

$$ y'(x) = f(x, y(x)), \quad y(x_0) = y_0, \tag{15.20} $$

where $f(x, y(x))$ is assumed to be continuous and real-valued in a rectangle:

$$ R: |x - x_0| \le a, \quad |y - y_0| \le b \quad (a, b > 0). \tag{15.21} $$

The key to Picard's method is to replace the differential equation in (15.20)
by the equivalent integral form,

$$ y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\, dt, \tag{15.22} $$

which is an integral equation because the unknown function $y(x)$
appears in the integrand. That the integral equation (15.22) is equivalent to
the original initial value problem can be checked by differentiating (15.22)
with respect to $x$.


Remark. Note that the initial condition $y(x_0) = y_0$ is automatically included
in (15.22).


We now try to solve (15.22). As a crude approximation to a solution, we take
the constant function $\varphi_0(x) = y_0$, which clearly satisfies the initial condition

$$ \varphi_0(x_0) = y_0, $$

whereas it does not satisfy (15.22) in general. Nevertheless, if we substitute
the constant function into $f(t, y(t))$ of (15.22), we have

$$ \varphi_1(x) = y_0 + \int_{x_0}^{x} f(t, \varphi_0(t))\, dt, \tag{15.23} $$

which is a closer approximation to a solution than $\varphi_0(x)$. By continuing the
process, we have a sequence of functions $\{\varphi_n(x)\}$:

♠ Successive approximation:

Given an integral equation (15.22) with respect to $y(x)$, a set of functions
defined by

$$ \varphi_0(x) = y_0, $$

$$ \varphi_n(x) = y_0 + \int_{x_0}^{x} f(t, \varphi_{n-1}(t))\, dt \quad (n = 1, 2, \cdots) \tag{15.24} $$

is called a successive approximation to a solution of (15.22).


We understand intuitively that taking the limit $n \to \infty$ yields

$$ \varphi_n(x) \to \varphi(x), $$

where $\varphi(x)$ is the exact solution of the integral equation (15.22). The
convergence property of the sequence $\{\varphi_n(x)\}$ and the equivalence of
the limit function $\varphi(x)$ to the solution of (15.22) are guaranteed if the
integrand $f(x, y(x))$ satisfies several conditions, as is demonstrated in
Sect. 15.2.3.

In summary, we now know the following:

Picard method:

The differential equation $y'(x) = f(x, y(x))$ for a given initial value
$y(x_0) = y_0$ can be solved by starting with $\varphi_0(x) = y_0$ and then computing
successive approximations (15.24). The process converges to a solution of
the differential equation, provided that $f(x, y)$ satisfies several specific conditions
given in Sect. 15.2.3.
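
The successive approximations (15.24) can be sketched numerically. In the illustration below the integral is replaced by a cumulative trapezoidal rule on a uniform grid, which is an assumption of this sketch and not part of Picard's method itself; the function name `picard`, the grid size, and the test problem $y' = y$, $y(0) = 1$ are likewise hypothetical choices. For this test problem the exact iterates are the Taylor partial sums of $e^x$, so $\varphi_1(1) = 2$ and $\varphi_2(1) = 2.5$.

```python
import math


def picard(f, x0, y0, x_end, n_iter, n_grid=1000):
    """Return grid points and samples of phi_{n_iter} from (15.24)."""
    h = (x_end - x0) / n_grid
    xs = [x0 + i * h for i in range(n_grid + 1)]
    phi = [y0] * (n_grid + 1)                 # phi_0(x) = y0
    for _ in range(n_iter):
        g = [f(x, y) for x, y in zip(xs, phi)]    # integrand f(t, phi_{n-1}(t))
        new = [y0]
        for i in range(n_grid):
            # cumulative trapezoidal rule for the integral from x0 to xs[i + 1]
            new.append(new[-1] + 0.5 * h * (g[i] + g[i + 1]))
        phi = new
    return xs, phi


rhs = lambda x, y: y                          # assumed test problem y' = y
_, phi1 = picard(rhs, 0.0, 1.0, 1.0, 1)
_, phi2 = picard(rhs, 0.0, 1.0, 1.0, 2)
_, phi25 = picard(rhs, 0.0, 1.0, 1.0, 25)
print(phi1[-1], phi2[-1], phi25[-1], math.e)
```

After 25 iterations the value at $x = 1$ agrees with $e$ to within the quadrature error of the grid, illustrating the convergence $\varphi_n(x) \to \varphi(x)$ for this example.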


15.2.2 Properties of Successive Approximations 

We have previously assumed that $f(x, y(x))$ is continuous in the rectangle $R$
defined in (15.21). Hereafter, we further assume that $f(x, y(x))$ is bounded on
$R$, which means the existence of a constant $M > 0$ such that

$$ |f(x, y(x))| \le M \quad \text{for all } (x, y) \in R. $$

In this case, the successive approximations $\{\varphi_n(x)\}$ show both the continuity
and boundedness property stated below.

♠ Continuity of successive approximations:

Let $f(x, y)$ be continuous and bounded by $|f(x, y)| \le M$ in a rectangle

$$ R: |x - x_0| \le a, \quad |y - y_0| \le b \quad (a, b > 0). $$

Then, the successive approximations $\varphi_n(x)$ are continuous on the interval

$$ I: |x - x_0| \le c = \min\left[a, \frac{b}{M}\right]. $$

♠ Boundedness of successive approximations:

Under the same conditions as above, the $\varphi_n(x)$ satisfy the inequality

$$ |\varphi_n(x) - y_0| \le M |x - x_0| $$

for all $x$ in $I$.

Remark. The condition $|f(x, y)| \le M$ has an important geometric meaning
in terms of the direction field. Since $y' = f(x, y)$, the direction field $y'$ is
bounded as $|y'| \le M$, namely, $-M \le y' \le M$ for all points in $R$. Therefore,
a solution curve $\varphi(x)$ that passes through $(x_0, y_0)$ must lie in the shadowed
region in Fig. 15.4.


Proof (of the continuity). From (15.23), we have

$$ |\varphi_1(x) - y_0| = \left| \int_{x_0}^{x} f(t, \varphi_0(t))\, dt \right| \le \left| \int_{x_0}^{x} |f(t, y_0)|\, dt \right| \le M |x - x_0|, \tag{15.25} $$

since $\varphi_0(t) = y_0$ and $|f(x, y_0)| \le M$. Now we tentatively assume that the
theorem is true for a function $\varphi_{n-1}$ with $n > 1$, and then prove inductively
that it is also true for $\varphi_n$. By hypothesis, all points $(t, \varphi_{n-1}(t))$ for $t$ in $I$ are
located within $R$. Hence, the function

$$ F_{n-1}(t) = f(t, \varphi_{n-1}(t)) $$

exists for $t$ in $I$, which implies that

$$ \varphi_n(x) = y_0 + \int_{x_0}^{x} F_{n-1}(t)\, dt $$

exists as a continuous function on $I$. ♣


Proof (of the boundedness). Since by hypothesis

$$ |F_{n-1}(t)| = |f(t, \varphi_{n-1}(t))| \le M, $$

we have

$$ |\varphi_n(x) - y_0| \le \left| \int_{x_0}^{x} F_{n-1}(t)\, dt \right| \le \left| \int_{x_0}^{x} |F_{n-1}(t)|\, dt \right| \le M |x - x_0|. $$

Therefore, the boundedness of $\varphi_n(x)$ has been proved by induction. ♣


Fig. 15.4. Continuity and boundedness of a solution curve $\varphi(x)$ on the interval
$I: |x - x_0| \le c = \min[a, b/M]$

15.2.3 Existence Theorem and Lipschitz Condition

Let $f(x, y(x))$ be a function defined for $(x, y)$ in the rectangle $R$ in the
$(x$-$y)$-plane. We would like to verify the existence of solutions for the
first-order ODEs expressed by

$$ y'(x) = f(x, y(x)) $$

by imposing a Lipschitz condition:

♠ Lipschitz condition:

We say that $f(x, y(x))$ satisfies a Lipschitz condition on a region $R$ if
there exists a constant $K > 0$ such that

$$ |f(x, y(x)) - f(x, z(x))| \le K |y(x) - z(x)| \tag{15.26} $$

for all $(x, y), (x, z) \in R$. Here the positive constant $K$ is called the
Lipschitz constant.
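
A Lipschitz constant can be explored numerically by sampling difference quotients. The sketch below is only an illustration: the function name `lipschitz_estimate`, the example $f(x, y) = y^2$, and the sampling grid on $|y| \le 2$ are assumptions. Sampling yields a lower bound on the smallest admissible $K$; for this example the exact value on $|y| \le 2$ is $K = 4$, since $|y^2 - z^2| = |y + z|\,|y - z|$.

```python
def lipschitz_estimate(f, x_values, y_values):
    """Largest sampled value of |f(x, y) - f(x, z)| / |y - z| over the grid."""
    best = 0.0
    for x in x_values:
        for y in y_values:
            for z in y_values:
                if y != z:
                    best = max(best, abs(f(x, y) - f(x, z)) / abs(y - z))
    return best


ys = [-2.0 + 0.1 * i for i in range(41)]          # grid on |y| <= 2
k_est = lipschitz_estimate(lambda x, y: y * y, [0.0], ys)
print(k_est)
```

The sampled estimate approaches 4 from below as the grid is refined, in line with the bound (15.26) for this function.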


Our most important theorem is presented below. 


4 Existence theorem: 

Suppose that 

1. f(x,y) is continuous and real- valued on the rectangle R. 

2. \f(x,y)\ < M for all (x,y) in R. 

3. / satisfies a Lipshitz condition with constant K in R. 
Then the initial value problem 


y’{x) = f(x, y(x)), y(x 0 ) = y 0 
has at least one solution y(x) in the interval 


I : \x 


Xq | < c = min 



(15.27) 


Proof Consider the successive approximations {tp n (x)} to a solution of the 
initial value problem (15.27), wherein f(x,y(x)) is assumed to satisfy the 
Lipshitz condition (15.26). We would like to prove that (i) the limit function 

ip(x) = lim ifin(x) 

n — >oo 


exists and (ii) that it is the solution of (15.27). 
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By definition of ip n (x), for n > 1, we obtain 


\Vn+i{x) - <p n (x ) | < 


/ (t, <Pn(t)) - f {t, tp n -l(t)) } dt 


< f \f(t,<p n (t))-f(t,<p n -l(t))\dt 

J Xq 

— K f \<Pn(t) - <Pn-l(t)\dt. (15.28) 

j Xn 


Set n = 1 in (15.28) and substitute it in the result (15.25) to find 

|2 


1 ^ 2 ( 2 ;) - < KM 


\x - Xq\ 
2 ! 


(15.29) 


Set n = 2 in (15.28) and use the result of (15.29) in the last term in (15.28). 
Continuing the process, we have 

\Mx) - <Pn- 1(*)| < K n ~ 1 M^——^—. (15.30) 

n! 

Observe that the right-hand side of (15.30) is the nth term of the power series 
for e K \ x ~ x o\ multiplied by M/K. This implies that the infinite series 


OO 

< ) + ^2 [^(x) “ Vk- 1(®)] (15.31) 

k = 1 


is absolutely (and thus ordinary) convergent, ensuring the existence of the 
limit function ip(x) = linin^oo ip n (x). (See Sect. 3.2 for the convergence prop- 
erties of Cauchy sequences.) 

Next we prove statement (ii) above. Note that the $n$th partial sum of
(15.31) is just $\varphi_n(x)$ and that the infinite series (15.31) equals the limit
function $\varphi(x)$. Hence, we have from (15.30) and (15.31) that

$$\begin{aligned}
|\varphi(x) - \varphi_n(x)| &= \left| \sum_{k=n+1}^{\infty} [\varphi_k(x) - \varphi_{k-1}(x)] \right|
\le \sum_{k=n+1}^{\infty} |\varphi_k(x) - \varphi_{k-1}(x)| \\
&\le \sum_{k=n+1}^{\infty} K^{k-1} M \frac{|x - x_0|^k}{k!}
\le \frac{M}{K} \sum_{k=n+1}^{\infty} \frac{(Kc)^k}{k!}
\le \frac{M}{K}\, a_n e^{Kc},
\end{aligned}$$

where

$$a_n = \frac{(Kc)^{n+1}}{(n+1)!}.$$

Since $a_n$ is a term of the power series of $e^{Kc}$, we have $\lim_{n \to \infty} a_n = 0$.
Therefore, the sequence of functions $\{\varphi_n(x)\}$ converges uniformly to $\varphi(x)$ in
the interval $I : x \in [x_0 - c, x_0 + c]$, which, together with the Lipschitz
condition (15.26), means that

$$\lim_{n \to \infty} f(x, \varphi_n(x)) = f(x, \varphi(x)). \tag{15.32}$$

That being so, we can write

$$\begin{aligned}
\varphi(x) = \lim_{n \to \infty} \varphi_{n+1}(x)
&= y_0 + \lim_{n \to \infty} \int_{x_0}^{x} f(t, \varphi_n(t))\, dt \\
&= y_0 + \int_{x_0}^{x} \lim_{n \to \infty} f(t, \varphi_n(t))\, dt \\
&= y_0 + \int_{x_0}^{x} f(t, \varphi(t))\, dt.
\end{aligned} \tag{15.33}$$

By differentiating with respect to $x$, we have

$$\varphi'(x) = f(x, \varphi(x)), \quad \varphi(x_0) = y_0.$$

These ensure that $\varphi(x)$ is a solution of our initial value problem (15.27). ∎


15.2.4 Uniqueness Theorem 

Next we examine the uniqueness of the solution $\varphi(x)$ that we found earlier
using the Picard approximation method (see Sect. 15.2.1). This is described
by the theorem below.

Uniqueness theorem:

Let $f(x, y)$ be continuous and satisfy the Lipschitz condition (15.26) in
the rectangle $R$. If $\varphi$ and $\psi$ are two solutions of the initial value problem
(15.27) in an interval $I$ containing $x_0$, then $\varphi(x) = \psi(x)$ for all $x$ in $I$.


Proof We assume that both $\varphi(x)$ and $\psi(x)$ are solutions of (15.27). For $x > x_0$,
we have from (15.33) and the Lipschitz condition (15.26) that

$$|\varphi(x) - \psi(x)| \le \int_{x_0}^{x} |f(t, \varphi(t)) - f(t, \psi(t))|\, dt
\le K \int_{x_0}^{x} |\varphi(t) - \psi(t)|\, dt. \tag{15.34}$$

This holds in the interval $I_\delta : x \in [x_0, x_0 + \delta]$ for arbitrarily small $\delta > 0$. Since
$|\varphi(x) - \psi(x)|$ is continuous in $I_\delta$, it has a maximum at some $x$ on $I_\delta$, which we
label $\mu$. Equation (15.34) provides that

$$\mu \le K \mu |x - x_0| \le K \mu \delta \quad \text{for all } x \text{ in } I_\delta, \tag{15.35}$$

so we have

$$(1 - K\delta)\mu \le 0.$$

Note that by definition $\mu \ge 0$. Hence, if $K\delta < 1$, we have $\mu = 0$, which says
that given any Lipschitz constant $K$, we can find a sufficiently small $\delta$ such
that

$$\max |\varphi(x) - \psi(x)| = 0,$$

i.e.,

$$|\varphi(x) - \psi(x)| = 0 \quad \text{for } x \in [x_0, x_0 + \delta].$$

Continuing this process yields the conclusion that $|\varphi(x) - \psi(x)| = 0$ for all $x$
in $I$. The same holds for the case $x < x_0$, completing the proof.


15.2.5 Remarks on the Two Theorems 

1. The existence and uniqueness theorems only ensure the existence and 
uniqueness of a solution. They do not tell us whether the solution can or 
cannot be expressed in terms of an elementary function form or help us 
to find the solution. 

2. Arguments for real-valued functions given thus far are straightforwardly
extended to the case that $f$ is complex-valued. In this case we must admit
complex-valued solutions and $f$ must be defined for complex $z$. The set
of points $z$ satisfying $|z - z_0| \le b$ becomes a disk with center $z_0$ and
radius $b$, so the domain $R$ is no longer a rectangle.

3. The initial value problem

$$y'(x) = \sqrt{|y(x)|}, \quad y(0) = 0,$$

has two solutions,

$$y(x) = 0 \quad \text{and} \quad y(x) = \begin{cases} x^2/4 & \text{if } x \ge 0, \\ -x^2/4 & \text{if } x < 0, \end{cases}$$

although $f(x, y) = \sqrt{|y|}$ is continuous for all $y$. The Lipschitz condition is
violated in any region that includes the line $y = 0$ because for $y_1 = 0$ and
positive $y_2$ we have

$$\frac{|f(x, y_2) - f(x, y_1)|}{|y_2 - y_1|} = \frac{\sqrt{y_2}}{y_2} = \frac{1}{\sqrt{y_2}} \quad (\sqrt{y_2} > 0) \tag{15.36}$$

and this can be made as large as we please by choosing $y_2$ sufficiently small,
whereas the Lipschitz condition requires that the quotient on the left-hand
side of (15.36) not exceed a fixed constant $K$.
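The non-uniqueness example in Remark 3 can be checked numerically. The following is a minimal sketch and not part of the original text: the function names (`f`, `y_nontrivial`, `dy_nontrivial`, `lipschitz_quotient`) and the sample points are illustrative assumptions. It confirms that the nonzero solution satisfies the equation and that the quotient in (15.36) grows without bound as $y_2 \to 0$.

```python
import math

def f(y):
    """Right-hand side of y' = sqrt(|y|); it does not depend on x."""
    return math.sqrt(abs(y))

def y_nontrivial(x):
    """Nonzero solution: x^2/4 for x >= 0 and -x^2/4 for x < 0."""
    return x * abs(x) / 4.0

def dy_nontrivial(x):
    """Exact derivative of y_nontrivial, namely |x|/2."""
    return abs(x) / 2.0

def lipschitz_quotient(y2, y1=0.0):
    """Quotient |f(y2) - f(y1)| / |y2 - y1| appearing in (15.36)."""
    return abs(f(y2) - f(y1)) / abs(y2 - y1)

# Largest deviation |y' - f(y)| of the nonzero solution over sample points.
samples = [i / 10.0 for i in range(-30, 31)]
residual = max(abs(dy_nontrivial(x) - f(y_nontrivial(x))) for x in samples)

# The quotient equals 1/sqrt(y2), so it grows as y2 shrinks.
quotients = [lipschitz_quotient(10.0 ** (-k)) for k in (2, 4, 6, 8)]
```

Since `f(0.0)` is zero, the trivial solution $y = 0$ also satisfies the equation, so both solutions pass through the same initial point.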
Exercises

1. Using the Picard method, evaluate the successive approximations to the
solution of the initial value problem

$$y'(x) = 1 + y(x)^2, \quad y(0) = 0.$$

Solution: Set $x_0 = 0$, $y_0 = 0$, $f(x, y) = 1 + y^2$ in (15.24) to find
that

$$\varphi_n(x) = \int_0^x \left\{ 1 + [\varphi_{n-1}(t)]^2 \right\} dt = x + \int_0^x [\varphi_{n-1}(t)]^2\, dt.$$

Hence, we obtain

$$\varphi_1(x) = x + \int_0^x 0\, dt = x, \quad \varphi_2(x) = x + \int_0^x t^2\, dt = x + \frac{x^3}{3},$$

$$\varphi_3(x) = x + \int_0^x \left( t + \frac{t^3}{3} \right)^2 dt = x + \frac{x^3}{3} + \frac{2}{15} x^5 + \frac{1}{63} x^7,$$

and so on. ∎


Remark. The exact solution of the above problem can be deduced by separating
variables:

$$y(x) = \tan x = x + \frac{x^3}{3} + \frac{2}{15} x^5 + \frac{17}{315} x^7 + \cdots \quad \left( -\frac{\pi}{2} < x < \frac{\pi}{2} \right).$$

The first three terms of $\varphi_3(x)$ and of the series above are the same. The
series converges only for $|x| < \pi/2$; therefore, all that we can expect is that
our sequence converges to a function that is the solution of our
problem for $|x| < \pi/2$.
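The iterates of Exercise 1 can be reproduced with exact rational arithmetic. The sketch below is an illustration only: it represents each $\varphi_n$ as a list of polynomial coefficients (constant term first), and the helper names `poly_mul`, `poly_add`, `poly_integrate`, `picard_step`, and `picard` are assumptions introduced here, not notation from the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_integrate(p):
    """Integral from 0 to x: a_k x^k becomes a_k x^(k+1) / (k+1)."""
    return [Fraction(0)] + [a / (k + 1) for k, a in enumerate(p)]

def picard_step(phi):
    """One Picard step for y' = 1 + y^2, y(0) = 0."""
    integrand = poly_add([Fraction(1)], poly_mul(phi, phi))
    return poly_integrate(integrand)

def picard(n):
    """Return phi_n, starting from phi_0(x) = y_0 = 0."""
    phi = [Fraction(0)]
    for _ in range(n):
        phi = picard_step(phi)
    return phi

phi3 = picard(3)  # coefficients of x + x^3/3 + (2/15) x^5 + (1/63) x^7
```

Comparing `phi3` with the series of $\tan x$ shows agreement through the $x^5$ term; the $x^7$ coefficient is $1/63$ rather than $17/315$, as stated in the remark.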


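2. By applying the Picard method to

$$y'(x) = x y(x), \quad y(0) = 1, \tag{15.37}$$

show that the Picard sequence $\{\varphi_n(x)\}$ converges absolutely and uniformly.

Solution: The integral equation corresponding to (15.37) becomes

$$y(x) = 1 + \int_0^x t\, y(t)\, dt.$$

The iterative equation is written as $\varphi_0(x) = 0$ and

$$\varphi_{n+1}(x) = 1 + \int_0^x t\, \varphi_n(t)\, dt \quad (n = 0, 1, 2, \cdots).$$

Thus, we easily find

$$\varphi_n(x) = \sum_{k=0}^{n-1} \frac{1}{k!} \left( \frac{x^2}{2} \right)^k = 1 + \frac{x^2}{2} + \frac{1}{2!} \left( \frac{x^2}{2} \right)^2 + \cdots + \frac{1}{(n-1)!} \left( \frac{x^2}{2} \right)^{n-1}.$$

The nature of the convergence is obvious for all real $x$, since $\varphi_n(x)$ is
a partial sum of the Taylor series of the function $\varphi(x) = e^{x^2/2}$; the
convergence is absolute for every $x$ and uniform on every bounded interval.
This means that $\varphi_n(x) \to \varphi(x)$ as $n \to \infty$. ∎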

3. For the equation given by

$$y'(x) = 2\, y(x)^{1/2}, \quad y(0) = 0,$$

check the uniqueness of the solution in connection with the Lipschitz condition.

Solution: This equation has the two solutions $y(x) = 0$ and $y(x) = x^2$ (for $x \ge 0$),
although $f(x, y) = 2 y^{1/2}$ is continuous for all $y \ge 0$. The Lipschitz
condition (15.26) is violated in any region that includes the
line $y(x) = 0$ because for $y_1 = 0$ and $y_2 > 0$ we have

$$\frac{|f(x, y_2) - f(x, y_1)|}{|y_2 - y_1|} = \frac{2\sqrt{y_2}}{y_2} = \frac{2}{\sqrt{y_2}},$$

which diverges for $y_2 \to 0$, exceeding any fixed constant $K$. ∎


15.3 Sturm-Liouville Problems


15.3.1 Sturm-Liouville Equation


ODEs encountered in physics are often classified as Sturm-Liouville equations:


Sturm-Liouville equation:

A Sturm-Liouville equation is a second-order homogeneous linear ODE
of the form

$$\frac{d}{dx}\left[ p(x) \frac{dy}{dx} \right] + q(x) y + \lambda w(x) y = 0, \tag{15.38}$$

where $\lambda$ is a parameter and $p$, $q$, $w$ are real-valued continuous functions
with $p(x) > 0$ and $w(x) > 0$. Here $w(x)$ is called a weight function.


Using the Sturm-Liouville operator $L$ defined by

$$L = \frac{1}{w(x)} \left[ \frac{d}{dx}\left( p(x) \frac{d}{dx} \right) + q(x) \right], \tag{15.39}$$

we reduce the Sturm-Liouville equation (15.38) to the abbreviated form

$$L y(x) = -\lambda y(x). \tag{15.40}$$

Examples The Legendre equation

$$(1 - x^2) y'' - 2x y' + n(n+1) y = 0, \quad n > 1, \quad x \in [-1, 1]$$

is expressed as

$$\left[ (1 - x^2) y' \right]' + n(n+1) y = 0.$$

This is in the Sturm-Liouville form with $p = 1 - x^2$, $q = 0$, $w = 1$, and
$\lambda = n(n+1)$.

Relevant terminology is given below. 

Sturm-Liouville system:

A Sturm-Liouville system consists of a Sturm-Liouville equation
(15.38) on a finite closed interval $a \le x \le b$, together with two separated
boundary conditions of the form

$$y(a) = \alpha y'(a) \quad \text{and} \quad y(b) = \beta y'(b)$$

with $\alpha$, $\beta$ being real.


A nontrivial solution of a Sturm-Liouville system is called an eigenfunction
and the corresponding $\lambda$ is called an eigenvalue. The set of all eigenvalues
of a Sturm-Liouville system is called the spectrum of the system.

Examples The Sturm-Liouville system consisting of the ODE

$$y'' + \lambda y = 0, \quad 0 \le x \le \pi,$$

with the separated boundary conditions

$$y(0) = 0, \quad y(\pi) = 0$$

has the eigenfunctions

$$y_n(x) = \sin nx$$

and the eigenvalues

$$\lambda_n = n^2 \quad (n = 1, 2, \cdots).$$
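The eigenvalues of this example can be recovered numerically by a shooting method: integrate $y'' + \lambda y = 0$ from $x = 0$ with $y(0) = 0$, $y'(0) = 1$ and look for values of $\lambda$ at which $y(\pi) = 0$. The sketch below is an illustration under assumptions of our own (a classical fourth-order Runge-Kutta integrator, bisection on brackets around $n^2$, and the names `shoot` and `eigenvalue_in`); it is not the method used in the text.

```python
import math

def shoot(lam, steps=2000):
    """Integrate y'' + lam*y = 0 on [0, pi] with y(0) = 0, y'(0) = 1
    by classical RK4 and return y(pi)."""
    h = math.pi / steps
    y, v = 0.0, 1.0  # v stands for y'
    for _ in range(steps):
        k1y, k1v = v, -lam * y
        k2y, k2v = v + 0.5 * h * k1v, -lam * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -lam * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return y

def eigenvalue_in(lo, hi, tol=1e-10):
    """Bisection on lam -> y(pi; lam); assumes one sign change in [lo, hi]."""
    f_lo = shoot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Brackets around 1, 4, 9 are chosen by hand for this illustration.
eigs = [eigenvalue_in(n * n - 0.5, n * n + 0.5) for n in (1, 2, 3)]
```

The brackets work here because $y(\pi; \lambda) = \sin(\sqrt{\lambda}\,\pi)/\sqrt{\lambda}$ changes sign exactly once in each of them; for a general Sturm-Liouville system suitable brackets would have to be located first.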


15.3.2 Conversion into a Sturm-Liouville Equation 

Mathematically, Sturm-Liouville equations represent only a small fraction of
the second-order differential equations. Nevertheless, any second-order equation
of the form

$$a(x) y'' + b(x) y' + c(x) y + \lambda e(x) y = 0$$

can be transformed into a Sturm-Liouville equation by multiplying it by the factor

$$\xi(x) = \exp\left[ \int^x \frac{b(s) - a'(s)}{a(s)}\, ds \right], \tag{15.41}$$

which yields a Sturm-Liouville form,

$$(\xi a y')' + \xi c y + \lambda \xi e y = 0,$$

with a nonnegative weight function $\xi(x) e(x)$.

Examples We show below that the Hermite equation of the form

$$y'' - 2x y' + 2\alpha y = 0 \tag{15.42}$$

can be transformed into a Sturm-Liouville equation. Substituting $a(x) = 1$
and $b(x) = -2x$ into (15.41) yields

$$\xi(x) = \exp\left[ \int^x (-2s)\, ds \right] = e^{-x^2},$$

by which we multiply both sides of (15.42) to obtain

$$e^{-x^2} y'' - 2x e^{-x^2} y' + 2\alpha e^{-x^2} y = \left( e^{-x^2} y' \right)' + 2\alpha e^{-x^2} y = 0.$$

This is the Sturm-Liouville form with

$$p(x) = e^{-x^2}, \quad q(x) = 0, \quad w(x) = e^{-x^2},$$

and $\lambda = 2\alpha$.
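The factor (15.41) works because it makes $(\xi a)' = \xi b$, so that $\xi a y'' + \xi b y'$ collapses into $(\xi a y')'$. The following sketch checks this identity numerically for the Hermite case $a(x) = 1$, $b(x) = -2x$, $\xi(x) = e^{-x^2}$. It is an illustration only; the central-difference step, the sample grid, and the names `xi`, `central_diff`, and `max_defect` are assumptions made here.

```python
import math

def xi(x):
    """Integrating factor (15.41) for a(x) = 1, b(x) = -2x."""
    return math.exp(-x * x)

def central_diff(g, x, h=1e-5):
    """Second-order central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def max_defect(points):
    """Largest value of |(xi*a)' - xi*b| over the sample points."""
    a = lambda x: 1.0
    b = lambda x: -2.0 * x
    xi_a = lambda x: xi(x) * a(x)
    return max(abs(central_diff(xi_a, x) - xi(x) * b(x)) for x in points)

# Sample the identity on [-2, 2]; the defect is at the level of the
# finite-difference error, not exactly zero.
defect = max_defect([i / 4.0 for i in range(-8, 9)])
```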


15.3.3 Self-adjoint Operators 

We know many facts about Sturm-Liouville problems. Below is an important
concept regarding the nature of these problems.

Adjoint operator:

The adjoint of an operator $L$, denoted by $L^\dagger$, is defined by

$$\int_a^b f^*(x) [L g(x)] \rho(x)\, dx = \left[ \int_a^b g^*(x) [L^\dagger f(x)] \rho(x)\, dx \right]^*. \tag{15.43}$$


Using inner product notation, we can write the definition of the adjoint
operator (15.43) as

$$(f, L g) = (g, L^\dagger f)^*.$$

The most important terminology in this section is given below.

Self-adjoint operator:

An operator $L$ is called self-adjoint (or Hermitian) if

$$L = L^\dagger$$

or, in inner product notation,

$$(f, L g) = (g, L f)^*.$$


It should be noted that an operator is said to be self-adjoint only if certain
boundary conditions are met by the functions $f$ and $g$ on which it acts. An
illustrative example follows:


Examples Let us derive the required boundary conditions for the linear operator

$$L = \frac{d^2}{dx^2}$$

to be self-adjoint over the interval $[a, b]$. From the definition of self-adjoint
operators (with weight $\rho(x) = 1$), the operator $L$ should satisfy the relation:

$$\int_a^b f^* \frac{d^2 g}{dx^2}\, dx = \left[ \int_a^b g^* \frac{d^2 f}{dx^2}\, dx \right]^* = \int_a^b g \frac{d^2 f^*}{dx^2}\, dx. \tag{15.44}$$

Through integration by parts, the left-hand side gives

$$\int_a^b f^* \frac{d^2 g}{dx^2}\, dx = \left[ f^* \frac{dg}{dx} \right]_a^b - \left[ \frac{df^*}{dx} g \right]_a^b + \int_a^b g \frac{d^2 f^*}{dx^2}\, dx. \tag{15.45}$$

From a comparison of (15.44) and (15.45), it follows that the operator $L$ is
Hermitian provided that

$$\left[ f^* \frac{dg}{dx} \right]_a^b = \left[ g \frac{df^*}{dx} \right]_a^b.$$

15.3.4 Required Boundary Condition 

In the example in Sect. 15.3.3, we derived the required boundary condition
for a specific Sturm-Liouville operator to be self-adjoint. For general
Sturm-Liouville operators, such a required boundary condition is given by the
following theorem.

Theorem:

A Sturm-Liouville operator is self-adjoint on $[a, b]$ if any two eigenfunctions
$y_i$ and $y_j$ of (15.38) satisfy the boundary condition

$$\left[ p\, y_i^* y_j' \right]_a^b = 0. \tag{15.46}$$


Proof It follows from the explicit form (15.39) of the Sturm-Liouville operator $L$ that,
with the inner product taken with respect to the weight function $w$ (so that the
factor $1/w$ in $L$ cancels against the weight),

$$(y_i, L y_j) = \int_a^b y_i^* (p y_j')'\, dx + \int_a^b y_i^* q y_j\, dx. \tag{15.47}$$

The first integral is integrated by parts to give

$$\left[ y_i^* p y_j' \right]_a^b - \int_a^b (y_i^*)' p y_j'\, dx,$$

in which the first term vanishes because we have assumed the boundary condition
(15.46). A second integration by parts then yields

$$-\left[ (y_i^*)' p y_j \right]_a^b + \int_a^b \left[ (y_i^*)' p \right]' y_j\, dx,$$

where the first term is again zero owing to our assumption. As a result, the
sum of the integrals in (15.47) reads

$$(y_i, L y_j) = \int_a^b \left\{ \left[ (y_i^*)' p \right]' y_j + y_i^* q y_j \right\} dx \tag{15.48}$$

$$= \left( \int_a^b \left\{ y_j^* (p y_i')' + y_j^* q y_i \right\} dx \right)^* = (y_j, L y_i)^*, \tag{15.49}$$

which completes the proof.


15.3.5 Reality of Eigenvalues 


Theorem: For a Sturm-Liouville system under the boundary condition
(15.46), we have:

(a) All eigenvalues are real.

(b) Eigenfunctions corresponding to distinct eigenvalues are orthogonal.

Proof of (a). If an eigenfunction $y_n$ belongs to the eigenvalue $\lambda_n$, so that
$L y_n = -\lambda_n y_n$, then

$$\lambda_n^* (y_n, y_n) = (\lambda_n y_n, y_n) = -(L y_n, y_n) = -(y_n, L y_n) = \lambda_n (y_n, y_n).$$

This indicates that $\lambda_n^* = \lambda_n$ since $(y_n, y_n) > 0$. Therefore $\lambda_n$ is real for all $n$. ∎

Proof of (b). According to the same argument as above,

$$\lambda_m (y_m, y_n) = (\lambda_m y_m, y_n) = -(L y_m, y_n) = -(y_m, L y_n) = \lambda_n (y_m, y_n).$$

Thus, for $\lambda_m \ne \lambda_n$, $(y_m, y_n) = 0$, which means that eigenfunctions
corresponding to distinct eigenvalues are orthogonal. ∎

Remark. If eigenvalues are degenerate, say, $\lambda_m = \lambda_n$ $(m \ne n)$, an orthogonal
set of eigenfunctions is constructed using the Gram-Schmidt orthogonalization
method. Namely, we can choose the eigenfunctions to be orthogonal
to each other with respect to the weight function $w$ such that if $(y_m, y_n) \ne 0$,
we replace $y_n$ by $\tilde{y}_n = y_n - a y_m$, where $a$ should be chosen so that
$(y_m, \tilde{y}_n) = 0$, i.e., $a = (y_m, y_n)/(y_m, y_m)$.
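Orthogonality and the Gram-Schmidt step can be illustrated numerically for the eigenfunctions $\sin nx$ of $y'' + \lambda y = 0$ on $[0, \pi]$ (weight $w = 1$). The sketch below is an illustration under assumptions of our own: real-valued functions, a composite Simpson rule for the inner product, and the names `inner`, `eigenfunction`, and `orthogonalize`. Because that system has no degenerate eigenvalues, the Gram-Schmidt step is demonstrated on an artificial non-orthogonal pair.

```python
import math

def inner(f, g, a=0.0, b=math.pi, n=2000, w=lambda x: 1.0):
    """Weighted inner product (f, g) = integral of f*g*w over [a, b]
    for real-valued f and g, by the composite Simpson rule (n even)."""
    h = (b - a) / n
    total = f(a) * g(a) * w(a) + f(b) * g(b) * w(b)
    for k in range(1, n):
        x = a + k * h
        total += (4 if k % 2 else 2) * f(x) * g(x) * w(x)
    return total * h / 3.0

def eigenfunction(n):
    """Eigenfunction sin(n x) belonging to the eigenvalue n^2."""
    return lambda x: math.sin(n * x)

def orthogonalize(y_m, y_n):
    """Gram-Schmidt step: y_n - a*y_m with a = (y_m, y_n) / (y_m, y_m)."""
    a = inner(y_m, y_n) / inner(y_m, y_m)
    return lambda x: y_n(x) - a * y_m(x)

gram_12 = inner(eigenfunction(1), eigenfunction(2))  # close to 0
norm_1 = inner(eigenfunction(1), eigenfunction(1))   # close to pi/2

# An artificial pair that is not orthogonal, then repaired.
mixed = lambda x: math.sin(x) + math.sin(2 * x)
repaired = orthogonalize(eigenfunction(1), mixed)
overlap = inner(eigenfunction(1), repaired)          # close to 0
```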


Exercises 


1. Show that the Bessel equation given by

$$x^2 y'' + x y' + (x^2 - n^2) y = 0 \quad \text{with } n > 0 \text{ and } x \in (-\infty, \infty)$$

can be expressed in the form of a Sturm-Liouville equation.

Solution: After the transformation $x \to kx$, we have

$$[x y'(kx)]' + \left( -\frac{n^2}{x} + k^2 x \right) y(kx) = 0, \quad n > 0,$$

where $p = x$, $q = -n^2/x$, $w = x$, and the parameter $\lambda = k^2$ in
(15.38). ∎

2. The Bernoulli equation is given as a nonlinear equation by

$$y' = a(x) y + b(x) y^k, \tag{15.50}$$

where $a(x)$, $b(x)$ are continuous functions in an interval $I$ and $k$ is an
arbitrary constant.

(a) Show that the transformation $u = y^{1-k}$ provides an inhomogeneous
linear equation for $u$.

(b) Find a solution for the transformed linear equation for $u$ under the
initial condition $u(x_0) = u_0$.

Solution:

(a) The transformed equation becomes $u' = (1 - k) a(x) u + (1 - k) b(x)$.

(b) The above equation is an inhomogeneous
linear equation of the form

$$u' = p(x) u + q(x),$$

where $p(x) = (1 - k) a(x)$, $q(x) = (1 - k) b(x)$ are continuous
functions. Let $P(x)$ be a function whose derivative is $p(x)$
such that

$$P(x) = \int_{x_0}^{x} p(t)\, dt,$$

where $x_0$ is a fixed point in $I$. Multiplying both sides of
the linear equation by $e^{-P(x)}$, we have the relation

$$(e^{-P} u)' = e^{-P} (u' - p u) = e^{-P} q.$$

Therefore, we obtain a solution such that

$$u(x) = u_0 e^{P(x)} + e^{P(x)} \int_{x_0}^{x} e^{-P(s)} q(s)\, ds,$$

where $u_0$ comes from the initial condition. ∎

3. The logistic equation is a special type of Bernoulli equation given by

$$y' = a y - b y^2, \tag{15.51}$$

where $a$, $b$ are constants. Find a solution for the above by imposing the
initial condition $y(x_0) = y_0$.

Solution: Using the solution of Exercise 2(b) with $k = 2$,
we have

$$y(x) = \frac{a}{b + (a/y_0 - b)\, e^{-a(x - x_0)}}.$$

Note that $y(x) \to a/b$ as $x \to \infty$ (for $a > 0$). ∎

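The closed-form logistic solution can be compared with a direct numerical integration of (15.51). The sketch below is illustrative only: the parameter values, the step count, and the function names `logistic_exact` and `logistic_rk4` are assumptions made for this check, and the integrator is a standard fourth-order Runge-Kutta scheme.

```python
import math

def logistic_exact(x, a, b, x0, y0):
    """Closed form y = a / (b + (a/y0 - b) exp(-a (x - x0)))."""
    return a / (b + (a / y0 - b) * math.exp(-a * (x - x0)))

def logistic_rk4(x_end, a, b, x0, y0, steps=2000):
    """Integrate y' = a*y - b*y^2 from x0 to x_end by classical RK4."""
    f = lambda y: a * y - b * y * y
    h = (x_end - x0) / steps
    y = y0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return y

# Illustrative parameters (not taken from the text).
a, b, x0, y0 = 2.0, 0.5, 0.0, 0.1
gap = abs(logistic_exact(3.0, a, b, x0, y0) - logistic_rk4(3.0, a, b, x0, y0))
limit_gap = abs(logistic_exact(50.0, a, b, x0, y0) - a / b)
```

With these values `gap` stays at the level of the integration error, and `limit_gap` confirms the approach to $a/b$ for large $x$.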
4. The Riccati equation is a nonlinear equation given by

$$y' + p(x) y + q(x) y^2 = r(x). \tag{15.52}$$

(a) Assuming $u(x)$ to be a particular solution of the above, show that $z(x)$
defined by $y(x) = u(x) + z(x)$ satisfies a Bernoulli equation.

(b) Show that the Riccati equation is reduced to a linear equation of the
second order by the transformation $y = Q v'/v$.

Solution:

(a) Substituting $y = u + z$ into the equation, we have

$$[z' + q(z^2 + 2uz) + p z] + [u' + p u + q u^2 - r] = 0.$$

The second bracket vanishes because $u$ solves (15.52), and we have the Bernoulli
equation $z' + (p + 2uq) z + q z^2 = 0$.

(b) The first-order derivative gives

$$y' = Q \left[ \frac{v''}{v} - \left( \frac{v'}{v} \right)^2 \right] + Q' \frac{v'}{v}.$$

Thus, we have

$$Q \frac{v''}{v} + (qQ - 1) Q \frac{v'^2}{v^2} + (Q' + pQ) \frac{v'}{v} - r = 0.$$

Setting $Q = 1/q(x)$, we have $v'' + (p - q'/q)\, v' - q r v = 0$. ∎
System of Ordinary Differential Equations


Abstract In this chapter we focus on an autonomous system (Sect. 16.3), which 
is a specific type of system of ordinary differential equations. Autonomous systems 
can be used to describe the dynamics of the physical objects that are encountered 
in physics and engineering problems, wherein the laws governing the motion of the 
objects are time-independent, namely, they hold true at all times. The stability of 
these dynamical systems is characterized by the critical point (Sect. 16.3.3), whose 
nature is revealed by the functional form of the autonomous systems. 


16.1 Systems of ODEs 

16.1.1 Systems of the First-Order ODEs 

This section deals with $n$ coupled ordinary differential equations (ODEs). The
formal definition is stated below.


Systems of ODEs:

A system of ODEs is given by

$$F_i\left( x;\ y_1, y_1', y_1'', \cdots, y_1^{(n_{i1})};\ y_2, y_2', y_2'', \cdots, y_2^{(n_{i2})};\ \cdots \right) = 0 \quad (i = 1, 2, \cdots), \tag{16.1}$$

which involves a set of unknown functions $y_1(x), y_2(x), \cdots$ and their derivatives
with respect to a single independent variable $x$.


For each $i$th equation of (16.1), we denote the highest order of the derivatives
of $y_j$ by $n_{ij}$. Hereafter, we consider the case of $n_{ij} = 1$ for all $i$ and $j$, i.e., a
system of $n$ ordinary differential equations (ODEs) of the first order expressed
by

$$\begin{aligned}
y_1'(x) &= f_1(x, y_1, y_2, \cdots, y_n), \\
y_2'(x) &= f_2(x, y_1, y_2, \cdots, y_n), \\
&\ \ \vdots \\
y_n'(x) &= f_n(x, y_1, y_2, \cdots, y_n).
\end{aligned} \tag{16.2}$$

Here, $\{f_k\}$, $k = 1, 2, \cdots, n$ are single-valued continuous functions in a certain
domain of their arguments and $\{y_k\}$, $k = 1, 2, \cdots, n$ are unknown complex
functions of a real variable $x$.


16.1.2 Column-Vector Notation

For convenience we use column-vector notation for an ordered set of unknown
functions $\{y_k(x)\}$ in which each $y_k(x)$ is called a component, which we
denote by a bold-face letter:

$$\boldsymbol{y}(x) = [y_1(x), y_2(x), \cdots, y_n(x)]^T, \tag{16.3}$$

where the norm of the vector is defined by

$$\|\boldsymbol{y}(x)\| = \left( |y_1|^2 + |y_2|^2 + \cdots + |y_n|^2 \right)^{1/2}. \tag{16.4}$$

Using vector notation, we can express (16.2) in the concise form

$$\boldsymbol{y}'(x) = \boldsymbol{f}(x, \boldsymbol{y}(x)), \tag{16.5}$$

where the column vector $\boldsymbol{f}$ is defined by its components

$$\boldsymbol{f}(x, \boldsymbol{y}(x)) = [f_1, f_2, \cdots, f_n]^T. \tag{16.6}$$

If there exists a set of functions $\boldsymbol{\varphi}(x) = (\varphi_1(x), \varphi_2(x), \cdots, \varphi_n(x))$ satisfying

$$\varphi_i'(x) = f_i(x, \varphi_1(x), \varphi_2(x), \cdots, \varphi_n(x)), \quad i = 1, 2, \cdots, n,$$

we say $\boldsymbol{\varphi}(x)$ is a solution of (16.2). The initial value problem consists of finding
a solution $\boldsymbol{\varphi}(x)$ of (16.5) in $I$ satisfying the initial condition
$\boldsymbol{\varphi}(x_0) = \boldsymbol{y}_0 = (y_{10}, y_{20}, \cdots, y_{n0})$.


16.1.3 Reducing the Order of ODEs 

Let us consider an $n$th-order ODE for $u(x)$ given by

$$\frac{d^n u(x)}{dx^n} + p_1(x) \frac{d^{n-1} u(x)}{dx^{n-1}} + \cdots + p_n(x) u(x) = q(x). \tag{16.7}$$

We show that equation (16.7) can always be reduced to a system of $n$ first-order
differential equations, which is stated as follows:

Theorem:

Given an $n$th-order ODE, it can always be reduced to a system of $n$
first-order ODEs.

Proof We take $u(x)$ and its derivatives as new unknown
functions defined by

$$y_k(x) = \frac{d^{k-1} u(x)}{dx^{k-1}}, \quad k = 1, 2, \cdots, n. \tag{16.8}$$

It is evident that (16.7) is equivalent to the following set of equations:

$$y_1' = y_2, \quad y_2' = y_3, \quad \cdots, \quad y_{n-1}' = y_n \tag{16.9}$$

and

$$y_n' = -p_1 y_n - p_2 y_{n-1} - \cdots - p_n y_1 + q. \tag{16.10}$$

Equations (16.9) and (16.10) can be written in a brief vector form as

$$\boldsymbol{y}' = \boldsymbol{f}(x, \boldsymbol{y}), \tag{16.11}$$

where the column vectors are defined as

$$\boldsymbol{y} = (y_1, y_2, \cdots, y_n)^T$$

and

$$\boldsymbol{f} = \boldsymbol{f}(x, \boldsymbol{y}) = [y_2, y_3, \cdots, y_n, -p_1 y_n - p_2 y_{n-1} - \cdots - p_n y_1 + q]^T.$$

∎
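The reduction (16.8)-(16.11) translates directly into code. The sketch below is an illustration under assumptions of our own: the coefficients $p_1, \cdots, p_n$ and $q$ are passed as Python callables, components are stored zero-based so that `y[k]` holds $u^{(k)}$, and the names `reduce_to_system` and `rk4` are hypothetical helpers. The resulting system is integrated with a standard fourth-order Runge-Kutta scheme for the test case $u'' + u = 0$.

```python
import math

def reduce_to_system(p, q):
    """Build f(x, y) of (16.11) from u^(n) + p[0] u^(n-1) + ... + p[n-1] u = q,
    where y[k] = u^(k) for k = 0, ..., n-1."""
    n = len(p)
    def f(x, y):
        # Last component is (16.10): -p_1 y_n - p_2 y_(n-1) - ... - p_n y_1 + q.
        last = q(x) - sum(p[i](x) * y[n - 1 - i] for i in range(n))
        # The other components are (16.9): each y_k' equals the next y.
        return list(y[1:]) + [last]
    return f

def rk4(f, x0, y0, x_end, steps=1000):
    """Classical RK4 for a first-order system y' = f(x, y)."""
    h = (x_end - x0) / steps
    x, y = x0, list(y0)
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
             for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]
        x += h
    return y

# u'' + u = 0 with u(0) = 0, u'(0) = 1, whose solution is sin x.
f_osc = reduce_to_system([lambda x: 0.0, lambda x: 1.0], lambda x: 0.0)
u_at_1 = rk4(f_osc, 0.0, [0.0, 1.0], 1.0)  # approximately [sin 1, cos 1]
```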


Example One of the most famous systems of the type (16.11) results from the
equation of motion for a particle of mass $m$. For a particle moving along the
$x$-axis, the equation of motion is

$$m \frac{d^2 x(t)}{dt^2} = F\left( t, x(t), \frac{dx(t)}{dt} \right), \tag{16.12}$$

where $t$ is the time and $F$ represents the force acting on the particle. To see
how the second-order ODE (16.12) can be viewed as a system of the form
(16.11), we make the following substitutions:

$$t \to x, \quad x \to y_1, \quad \frac{dx}{dt} \to y_2.$$

Then (16.12) is equivalent to a system of two equations:

$$y_1' = y_2, \quad y_2' = \frac{1}{m} F(x, y_1, y_2),$$

which is of the form $\boldsymbol{y}'(x) = \boldsymbol{f}(x, \boldsymbol{y})$.

16.1.4 Lipschitz Condition in Vector Spaces

The vector equation

$$\boldsymbol{y}'(x) = \boldsymbol{f}(x, \boldsymbol{y}) \tag{16.13}$$

is obviously analogous to the scalar equation

$$y'(x) = f(x, y).$$

This analogy suggests that the definition of a Lipschitz condition
can be extended to the vector equation. The extended Lipschitz condition
provides a simple sufficient condition for the uniqueness and existence of
solutions, which implies that all the theorems for the scalar equation can be
generalized so as to hold for the vector equation.

Lipschitz condition for a vector function:

A vector function $\boldsymbol{f}(x, \boldsymbol{y})$ in (16.13) is said to satisfy the Lipschitz
condition on a region $R$ if and only if

$$\|\boldsymbol{f}(x, \boldsymbol{y}(x)) - \boldsymbol{f}(x, \boldsymbol{z}(x))\| \le K \|\boldsymbol{y}(x) - \boldsymbol{z}(x)\|,$$

$$(R : |x - x_0| \le a,\ \|\boldsymbol{y} - \boldsymbol{y}_0\| \le b,\ \|\boldsymbol{z} - \boldsymbol{z}_0\| \le b) \tag{16.14}$$

for the Lipschitz constant $K$.


When $\boldsymbol{f}(x, \boldsymbol{y})$ satisfies the Lipschitz condition noted above, we see from (16.14)
that

$$|f_k(x, y_1, y_2, \cdots, y_n) - f_k(x, z_1, z_2, \cdots, z_n)| \le K (|y_1 - z_1| + |y_2 - z_2| + \cdots + |y_n - z_n|) \quad (k = 1, 2, \cdots, n). \tag{16.15}$$

Using this, we can prove the theorem of the existence and uniqueness of
solutions for the general vector equation (16.13). For instance, the uniqueness
of the solution for (16.13) is shown as follows. Let $\boldsymbol{y}(x)$ and $\boldsymbol{z}(x)$ be two
solutions with the same initial value at $x_0$. The right-hand
side of (16.15) yields

$$K \sum_{k=1}^{n} |y_k(x) - z_k(x)| \le K \sum_{k=1}^{n} \int_{x_0}^{x} |f_k(t, \boldsymbol{y}(t)) - f_k(t, \boldsymbol{z}(t))|\, dt \le nK^2 \int_{x_0}^{x} \sum_{k=1}^{n} |y_k(t) - z_k(t)|\, dt, \tag{16.16}$$

which holds for the interval $I : x \in [x_0, x_0 + \delta]$ for any small $\delta$. Since the left-hand
side of (16.16) is continuous on $I$, it has a maximum at some $x$, which
we label $\mu$. Then, the inequality (16.16) becomes

$$\mu \le nK\mu (x - x_0) \le nK\mu\delta,$$

which gives us $\mu(1 - nK\delta) \le 0$. For any small $\delta > 0$ with $nK\delta < 1$, we have $\mu = 0$, which
indicates that $\sum_k |y_k - z_k| = 0$. The same holds true for the case $x < x_0$. Thus,
the solution of (16.13) is unique.


Exercises 

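1. Consider an initial value problem given by

$$\boldsymbol{y}' = \boldsymbol{f}(x, \boldsymbol{y}), \quad \boldsymbol{y}(x_0) = \boldsymbol{y}_0,$$

defined on $R : |x - x_0| \le a$, $\|\boldsymbol{y} - \boldsymbol{y}_0\| \le b$ $(a, b > 0)$. Assuming that $\boldsymbol{f}$ is
continuous on $R$, a sequence of successive approximations $\{\boldsymbol{\varphi}_n(x)\}$ is
given by

$$\boldsymbol{\varphi}_0(x) = \boldsymbol{y}_0$$

and

$$\boldsymbol{\varphi}_{n+1}(x) = \boldsymbol{y}_0 + \int_{x_0}^{x} \boldsymbol{f}(t, \boldsymbol{\varphi}_n(t))\, dt \quad \text{for } n = 0, 1, 2, \cdots.$$

Using this procedure, find a sequence of successive approximations for
$(y_1', y_2') = (y_2, -y_1)$, for $\boldsymbol{y}(0) = (0, 1)$.


Solution: Here $\boldsymbol{f}(x, \boldsymbol{y}) = (y_2, -y_1)$, so we have

$$\boldsymbol{\varphi}_0(x) = (0, 1),$$

$$\boldsymbol{\varphi}_1(x) = (0, 1) + \int_0^x (1, 0)\, dt = (x, 1),$$

$$\boldsymbol{\varphi}_2(x) = (0, 1) + \int_0^x (1, -t)\, dt = (0, 1) + \left( x, -\frac{x^2}{2} \right) = \left( x, 1 - \frac{x^2}{2} \right).$$

Continuing with this process, we find the solution of the problem as
$\boldsymbol{\varphi}_k(x) \to \boldsymbol{\varphi}(x) = (\sin x, \cos x)$. ∎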
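The vector iteration of this exercise can be carried out exactly with polynomial coefficient lists. The sketch below is illustrative only: each component of $\boldsymbol{\varphi}_n$ is stored as a list of rational coefficients (constant term first), and the helper names `trim`, `add_const`, `integrate`, `picard_step`, `picard`, and `evaluate` are assumptions introduced here.

```python
import math
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def add_const(c, p):
    """Return the polynomial c + p(x)."""
    return trim([p[0] + c] + p[1:])

def integrate(p):
    """Integral from 0 to x of p(t) dt, term by term."""
    return [Fraction(0)] + [a / (k + 1) for k, a in enumerate(p)]

def picard_step(phi, y0):
    """One Picard step for f(x, y) = (y2, -y1)."""
    phi1, phi2 = phi
    new1 = add_const(y0[0], integrate(phi2))
    new2 = add_const(y0[1], integrate([-a for a in phi1]))
    return (new1, new2)

def picard(n, y0=(Fraction(0), Fraction(1))):
    """Return phi_n as a pair of coefficient lists, with phi_0 = y0."""
    phi = ([y0[0]], [y0[1]])
    for _ in range(n):
        phi = picard_step(phi, y0)
    return phi

def evaluate(p, x):
    """Horner evaluation of a coefficient list at a float x."""
    result = 0.0
    for a in reversed(p):
        result = result * x + float(a)
    return result

phi2 = picard(2)  # (x, 1 - x^2/2)
approx_sin, approx_cos = (evaluate(p, 1.0) for p in picard(12))
```

Each step adds one more term of the Taylor series of $\sin x$ or $\cos x$, which is why the iterates approach $(\sin x, \cos x)$.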


16.2 Linear System of ODEs 


16.2.1 Basic Terminology 


We now focus on a particular class of systems of ODEs called a linear system
of first-order ODEs, described by

$$\begin{aligned}
\frac{dy_1(x)}{dx} - \sum_{j=1}^{n} a_{1j}(x) y_j(x) &= q_1(x), \\
\frac{dy_2(x)}{dx} - \sum_{j=1}^{n} a_{2j}(x) y_j(x) &= q_2(x), \\
&\ \ \vdots \\
\frac{dy_n(x)}{dx} - \sum_{j=1}^{n} a_{nj}(x) y_j(x) &= q_n(x).
\end{aligned}$$

Here $a_{kj}(x)$ and $q_k(x)$ with $j, k = 1, 2, \cdots, n$ are continuous functions
of $x$ on some interval $I$. For convenience we use the vector representation
given by

$$\frac{d\boldsymbol{y}(x)}{dx} - A(x) \boldsymbol{y}(x) = \boldsymbol{q}(x), \tag{16.17}$$

where $A = [a_{ij}]$ is an $n \times n$ matrix. Therefore, $A\boldsymbol{y}$ stands for the matrix
$A$ applied to the column vector $\boldsymbol{y} = [y_1, y_2, \cdots, y_n]^T$, namely, the linear
transform of $\boldsymbol{y}$ by $A$. The vector $\boldsymbol{q}$ is defined as $\boldsymbol{q} = [q_1, q_2, \cdots, q_n]^T$. Given
any $\boldsymbol{y}(x_0)$ for $x_0$ in $I$, there exists a unique solution $\boldsymbol{\varphi}(x)$ on $I$ such that

$$\boldsymbol{\varphi}(x_0) = [y_1(x_0), y_2(x_0), \cdots, y_n(x_0)]^T.$$

Writing (16.17) in terms of a linear operator $L$ yields

$$L[\boldsymbol{y}(x)] = \boldsymbol{q}(x),$$

where $L$ is defined as

$$L = \frac{d}{dx} - A. \tag{16.18}$$

If $\boldsymbol{q}(x) = 0$ for all $x$ on $I$, (16.17) is said to be a linear homogeneous
system of $n$th order, expressed by

$$\frac{d\boldsymbol{y}(x)}{dx} - A(x) \boldsymbol{y}(x) = 0. \tag{16.19}$$

Otherwise, (16.17) is called inhomogeneous. A homogeneous system obtained
from the inhomogeneous system (16.17) by setting $\boldsymbol{q}(x) = 0$ is called
the reduced or complementary system.


Remark. Note that every linear homogeneous system always has a trivial
solution $\boldsymbol{\varphi}(x) = 0$, as can be immediately checked. From the uniqueness of
the solution, therefore, no nontrivial solution can vanish at any point $x$.


16.2.2 Vector Space of Solutions 

Let $\boldsymbol{\varphi}_i(x)$ $(i = 1, 2, \cdots)$ be solutions for an $n$-dimensional linear homogeneous
system

$$\boldsymbol{y}'(x) = A(x) \boldsymbol{y}(x). \tag{16.20}$$

Referring to the axioms given in Sect. 4.2.1, it readily follows that the solutions
$\{\boldsymbol{\varphi}_i(x)\}$ form a vector space $V$. Indeed, if $\boldsymbol{\varphi}_1(x)$ and $\boldsymbol{\varphi}_2(x)$ are solutions of
(16.20), then $c_1 \boldsymbol{\varphi}_1(x) + c_2 \boldsymbol{\varphi}_2(x)$ with arbitrary constants $c_1$, $c_2$ is also a solution
of (16.20), and so on.

Now we pose a question as to the dimension of the vector space $V$ mentioned
above. We have the answer in the following theorem:

Theorem:

Solutions of the system (16.20) on an interval $I$ form an $n$-dimensional
vector space if the $n \times n$ matrix $A(x)$ is continuous on $I$.


Proof The continuity of $A(x)$ implies that all its components do not diverge.
This allows us to set a constant $K$,

$$K = \max \sum_{i=1}^{n} |a_{ij}(x)|,$$

and it then follows that the vector $\boldsymbol{f}$ defined by $\boldsymbol{f}(x, \boldsymbol{y}) = A(x) \boldsymbol{y}(x)$ satisfies the
Lipschitz condition:

$$\|\boldsymbol{f}(x, \boldsymbol{y}) - \boldsymbol{f}(x, \boldsymbol{z})\| \le K \|\boldsymbol{y} - \boldsymbol{z}\| \quad \text{for } x \in I.$$

From the existence and uniqueness theorems we know that there are $n$ solutions
$\boldsymbol{\varphi}_i(x)$ of (16.20) such that each solution exists on the entire interval $I$
and satisfies the initial condition

$$\boldsymbol{\varphi}_i(x_0) = \boldsymbol{e}_i \quad (i = 1, 2, \cdots, n) \quad \text{for } x_0 \in I, \tag{16.21}$$

where the $\boldsymbol{e}_i$'s are $n$ linearly independent vectors.

We tentatively assume that the solutions $\boldsymbol{\varphi}_i$ are linearly dependent on $I$.
Then there exist constants $c_i$, not all zero, such that

$$\sum_{i=1}^{n} c_i \boldsymbol{\varphi}_i(x) = 0 \quad \text{for every } x \text{ on } I.$$

In particular, setting $x = x_0$, and using the initial condition (16.21), we have

$$\sum_{i=1}^{n} c_i \boldsymbol{e}_i = 0,$$

which contradicts the assumed linear independence of the $\boldsymbol{e}_i$. Hence, we conclude
that the solutions $\boldsymbol{\varphi}_i$ are linearly independent on $I$.

Next we prove the completeness of $\{\boldsymbol{\varphi}_i(x)\}$; i.e., that every solution $\boldsymbol{\psi}(x)$ of
(16.20) can be expanded as a linear combination of the $\boldsymbol{\varphi}_i(x)$ satisfying the initial
condition (16.21). Since the $\boldsymbol{e}_i$ are linearly independent in the $n$-dimensional
Euclidean space $E^n$, they form a basis for $E^n$ and there exist unique constants
$b_i$ such that the constant vector $\boldsymbol{\psi}(x_0)$ can be expressed as

$$\boldsymbol{\psi}(x_0) = \sum_{i=1}^{n} b_i \boldsymbol{e}_i. \tag{16.22}$$

Consider the vector

$$\boldsymbol{\varphi}(x) = \sum_{i=1}^{n} b_i \boldsymbol{\varphi}_i(x),$$

where the $b_i$ are identical to those in (16.22). Clearly $\boldsymbol{\varphi}(x)$ is a solution of
(16.20) on $I$. In addition, the initial value of $\boldsymbol{\varphi}$ reads

$$\boldsymbol{\varphi}(x_0) = \sum_{i=1}^{n} b_i \boldsymbol{e}_i,$$

so that $\boldsymbol{\varphi}(x_0) = \boldsymbol{\psi}(x_0)$. In view of the uniqueness theorem, we have

$$\boldsymbol{\varphi}(x) = \boldsymbol{\psi}(x) \quad \text{for every } x \text{ on } I.$$

This leads to the conclusion that every solution $\boldsymbol{\psi}(x)$ of an $n$th-order linear
homogeneous system (16.20) is expressed by the unique linear combination

$$\boldsymbol{\psi}(x) = \sum_{i=1}^{n} b_i \boldsymbol{\varphi}_i(x) \quad \text{for every } x \text{ on } I,$$

where the $b_i$ are uniquely determined once we have $\boldsymbol{\psi}(x)$. As a result, $n$
solutions $\boldsymbol{\varphi}_i(x)$ of the system (16.20) form the basis for an $n$-dimensional
vector space. ∎

16.2.3 Fundamental Systems of Solutions 

Again let $\boldsymbol{\varphi}_i(x) = [\varphi_{1i}(x), \cdots, \varphi_{ni}(x)]^T$ $(i = 1, 2, \cdots, n)$ be solutions of the
linear homogeneous system (16.20) such that

$$\boldsymbol{\varphi}_i'(x) = A(x) \boldsymbol{\varphi}_i(x) \quad \text{for all } i = 1, 2, \cdots, n.$$

Note here that $\{\boldsymbol{\varphi}_i(x)\}$ may or may not be linearly independent, since no
initial condition is imposed (contrary to the case of (16.21)). Specifically, if
the set $\{\boldsymbol{\varphi}_i(x)\}$ is linearly independent, it is called
the fundamental system of solutions of (16.20).

Fundamental system of solutions:

A collection of $n$ solutions $\{\boldsymbol{\varphi}_i(x)\}$ of an $n$-dimensional linear homogeneous
system is called a fundamental system of solutions of the system
if it is linearly independent.


Remark. The significance of a fundamental system of solutions lies in the fact
that it can describe any solution $\boldsymbol{\varphi}(x)$ of the corresponding linear homogeneous
system. Consequently, the problem of finding a solution $\boldsymbol{\varphi}(x)$ becomes
equivalent to that of finding $n$ linearly independent solutions.

With this terminology, the theorem presented in Sect. 16.2.2 leads to the
following result:

Theorem:

A fundamental system of solutions exists for an arbitrary linear homogeneous
system.

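Example The second-order equation

$$y''(t) + y(t) = 0 \tag{16.23}$$

is equivalent to the two-dimensional linear system

$$\boldsymbol{u}'(t) = A \boldsymbol{u}(t) \tag{16.24}$$

with

$$\boldsymbol{u}(t) = \begin{bmatrix} y(t) \\ y'(t) \end{bmatrix} \quad \text{and} \quad A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$

The fundamental system of solutions of (16.24) is given by

$$\boldsymbol{\varphi}_1(t) = [\cos t, -\sin t]^T \quad \text{and} \quad \boldsymbol{\varphi}_2(t) = [\sin t, \cos t]^T,$$

whose linear independence follows from the fact that $c_1 \cos t + c_2 \sin t = 0$ for all $t$
implies $c_1 = c_2 = 0$. Furthermore, $\boldsymbol{\varphi}_1(0) = (1, 0)$ and $\boldsymbol{\varphi}_2(0) = (0, 1)$, so any
solution $\boldsymbol{\varphi}(t)$ is given by

$$\boldsymbol{\varphi}(t) = a_0 \boldsymbol{\varphi}_1(t) + b_0 \boldsymbol{\varphi}_2(t) \quad \text{for } -\infty < t < \infty, \tag{16.25}$$

where $\boldsymbol{\varphi}(0) = (a_0, b_0)$.

Remark. The solution $\boldsymbol{\varphi}(t)$ in (16.25) corresponds to the solution of the
second-order ODE (16.23) satisfying the initial conditions $y(0) = a_0$ and
$y'(0) = b_0$.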


16.2.4 Wronskian for a System of ODEs 

The theorems given in Sects. 16.2.2 and 16.2.3 ensure the existence of a
fundamental system of solutions for any linear homogeneous system of the form

$$\boldsymbol{y}'(x) = A(x) \boldsymbol{y}(x). \tag{16.26}$$

However, they provide no information as to whether a certain set of solutions
is a fundamental system or not. In what follows, we consider the criteria
concerning this issue. Following are preliminary concepts that we need in
order to proceed.

Wronsky determinant:

Let $\{\boldsymbol{\varphi}_k(x)\}$ $(k = 1, 2, \cdots, n)$ be solutions of (16.26), where
$\boldsymbol{\varphi}_k(x) = [\varphi_{1k}(x), \cdots, \varphi_{nk}(x)]^T$. Then the scalar function

$$W(x) = \det \begin{bmatrix}
\varphi_{11} & \varphi_{12} & \cdots & \varphi_{1n} \\
\varphi_{21} & \varphi_{22} & \cdots & \varphi_{2n} \\
\vdots & \vdots & & \vdots \\
\varphi_{n1} & \varphi_{n2} & \cdots & \varphi_{nn}
\end{bmatrix} \tag{16.27}$$

is called the Wronsky determinant (or the Wronskian) of the solutions
$\{\boldsymbol{\varphi}_k(x)\}$.


If $\{\boldsymbol{\varphi}_k(x)\}$ is a fundamental system of solutions of (16.26), then the matrix
corresponding to $W(x)$ is called a fundamental matrix. Hence, a fundamental
matrix is a matrix whose columns form a fundamental system of solutions
of (16.26).

Example For the two-dimensional system given in Sect. 16.2.3, the matrix

$$\Phi(t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}, \quad -\infty < t < \infty$$

is a fundamental matrix and $W(t) = 1$ for all $t$.


16.2.5 Liouville Formula for a Wronskian 

The following theorem shows that given any $n$ solutions of (16.26) and any $x_0$
in $(r_1, r_2)$, we can completely determine the corresponding Wronskian without
computing the $n \times n$ determinant.


Liouville formula:

Let $\{\boldsymbol{\varphi}_k(x)\}$ $(k = 1, 2, \cdots, n)$ be any $n$ solutions of (16.26) and let $x_0$ be
in $(r_1, r_2)$. Then the Wronskian of $\{\boldsymbol{\varphi}_k(x)\}$ for $x \in (r_1, r_2)$ is given by

$$W(x) = W(x_0) \exp\left[ \int_{x_0}^{x} \mathrm{tr} A(s)\, ds \right].$$


See Exercise 2 for the proof. Since $\exp\left[ \int_{x_0}^{x} \mathrm{tr} A(s)\, ds \right]$ is never zero, the
theorem implies that the Wronskian of any collection of $n$ solutions of (16.26)
is identically zero or never zero on $(r_1, r_2)$. The latter case characterizes a
fundamental system, as shown by the following theorem:


Theorem:

A necessary and sufficient condition for $\{\boldsymbol{\varphi}_k(x)\}$ $(k = 1, 2, \cdots, n)$ to
be a fundamental system of solutions of (16.26) is that $W(x) \ne 0$ for
$r_1 < x < r_2$.


Proof Let $\{\boldsymbol{\varphi}_k(x)\}$ $(k = 1, 2, \cdots, n)$ be a fundamental system of solutions of
(16.26) and let $\boldsymbol{\varphi}(x)$ be any nontrivial solution. Then there exist $c_1, \cdots, c_n$
not all zero such that $\boldsymbol{\varphi}(x) = \sum_{k=1}^{n} c_k \boldsymbol{\varphi}_k(x)$, and by the uniqueness of the
solutions the $c_k$ are unique. If $\boldsymbol{c} = [c_1, \cdots, c_n]^T$ and $\Phi(x)$ is the fundamental
matrix of $\{\boldsymbol{\varphi}_k(x)\}$, then the previous relation can be written as

$$\boldsymbol{\varphi}(x) = \Phi(x) \boldsymbol{c}.$$

For any $x$ in $(r_1, r_2)$, this is a system of $n$ linear equations in the unknowns
$c_1, \cdots, c_n$. Since this has a unique solution in $\boldsymbol{c}$, $\det \Phi$ cannot be zero, i.e.,

$$\det \Phi(x) = W(x) \ne 0 \quad \text{for any } x \in (r_1, r_2).$$

Conversely, $W(x) \ne 0$ for $r_1 < x < r_2$ implies that the columns
$\boldsymbol{\varphi}_1(x), \cdots, \boldsymbol{\varphi}_n(x)$ of $\Phi(x)$ are linearly independent for $r_1 < x < r_2$. Since
they are solutions of (16.26), they form a fundamental system of solutions. ∎
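The Liouville formula can be tested numerically on a small system. The sketch below is an illustration under assumptions of our own: a hypothetical $2 \times 2$ coefficient matrix with trace $-x$, a fundamental matrix started from the identity at $x_0 = 0$ (so $W(x_0) = 1$), and a fourth-order Runge-Kutta integration of $\Phi' = A(x)\Phi$. The names `a_matrix`, `rhs`, `shifted`, `solve`, and `det2` are not from the text.

```python
import math

def a_matrix(x):
    """Illustrative coefficient matrix A(x); its trace is -x."""
    return [[0.0, 1.0], [-1.0, -x]]

def rhs(x, phi):
    """Right-hand side A(x) * Phi of the matrix equation Phi' = A(x) Phi."""
    a = a_matrix(x)
    return [[sum(a[i][k] * phi[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def shifted(phi, k, h):
    """Return Phi + h * k, entry by entry."""
    return [[phi[i][j] + h * k[i][j] for j in range(2)] for i in range(2)]

def solve(x_end, steps=2000):
    """RK4 integration of Phi' = A(x) Phi from Phi(0) = identity."""
    h = x_end / steps
    x = 0.0
    phi = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(steps):
        k1 = rhs(x, phi)
        k2 = rhs(x + h / 2, shifted(phi, k1, h / 2))
        k3 = rhs(x + h / 2, shifted(phi, k2, h / 2))
        k4 = rhs(x + h, shifted(phi, k3, h))
        phi = [[phi[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j]
                                     + 2 * k3[i][j] + k4[i][j])
                for j in range(2)] for i in range(2)]
        x += h
    return phi

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

x_end = 1.5
wronskian_numeric = det2(solve(x_end))
# Liouville: W(x) = W(0) * exp(integral of tr A) = exp(-x^2 / 2).
wronskian_liouville = math.exp(-x_end ** 2 / 2.0)
```

Because the Wronskian computed this way never vanishes, the two columns of the integrated matrix form a fundamental system, in line with the theorem above.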


16.2.6 Wronskian for an nth-Order Linear ODE 

The previous results for systems of ODEs can be applied to an nth-order
linear equation

\[
u^{(n)}(x) + a_1(x) u^{(n-1)}(x) + \cdots + a_n(x) u(x) = 0, \tag{16.28}
\]

since (16.28) is transformed into a vector form as

\[
y' = A y, \tag{16.29}
\]

where

\[
y = \begin{pmatrix} u \\ u' \\ \vdots \\ u^{(n-1)} \end{pmatrix}
\quad \text{and} \quad
A = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_n(x) & -a_{n-1}(x) & \cdots & -a_2(x) & -a_1(x)
\end{pmatrix}.
\]

Relevant terminology and theorems are given below.

♠ Fundamental system of solutions:

A collection

\[
\xi_1(x), \cdots, \xi_n(x), \qquad r_1 < x < r_2
\]

of solutions of (16.28) is called a fundamental system of solutions of
(16.28) if it is linearly independent.

♠ Theorem:

A fundamental system of solutions of equation (16.28) exists.


Proof We know that a fundamental system of solutions of (16.29) exists, and
we express it by $\varphi_1(x), \cdots, \varphi_n(x)$, where $\varphi_k(x) = [\varphi_{1k}(x), \cdots, \varphi_{nk}(x)]^T$.
Furthermore, we may assume that given $x_0$ in $(r_1, r_2)$,

\[
\varphi_k(x_0) = [0, \cdots, 0, 1, 0, \cdots, 0]^T = e_k, \qquad k = 1, 2, \cdots, n,
\]

where the single nonzero component 1 in $e_k$ is assigned to the $k$th place in the
square brackets. By the correspondence of solutions of (16.28) and (16.29), we
have

\[
\varphi_k(x) = \left[ \xi_k(x), \xi_k'(x), \cdots, \xi_k^{(n-1)}(x) \right]^T
\]

for some solution $u(x) = \xi_k(x)$ of (16.28). The collection $\xi_1(x), \cdots, \xi_n(x)$
comprises distinct nontrivial solutions, since they satisfy distinct initial
conditions and $\xi_k = 0$ for $r_1 < x < r_2$ would imply that $\varphi_k(x) = 0$, which is
impossible.

Finally, if there existed constants $c_1, \cdots, c_n$ not all zero such that
$\sum_{k=1}^{n} c_k \xi_k(x) = 0$ for $r_1 < x < r_2$, then

\[
\sum_{k=1}^{n} c_k \xi_k'(x) = 0, \ \cdots, \ \sum_{k=1}^{n} c_k \xi_k^{(n-1)}(x) = 0, \qquad r_1 < x < r_2.
\]

This implies that

\[
\sum_{k=1}^{n} c_k \varphi_k(x) = 0, \qquad r_1 < x < r_2,
\]

which contradicts the fact that $\{\varphi_k(x)\}$ is a fundamental system of (16.29).

♣


We now define the Wronskian of a collection of n solutions of (16.28).

♠ Wronsky determinant:

Given any collection $\xi_1(x), \cdots, \xi_n(x)$ of solutions of (16.28), the determinant

\[
W(x) = \det \begin{pmatrix}
\xi_1 & \xi_2 & \cdots & \xi_n \\
\xi_1' & \xi_2' & \cdots & \xi_n' \\
\vdots & \vdots & & \vdots \\
\xi_1^{(n-1)} & \xi_2^{(n-1)} & \cdots & \xi_n^{(n-1)}
\end{pmatrix} \tag{16.30}
\]

is called the Wronsky determinant (or the Wronskian) of the solutions
$\{\xi_k(x)\}$ $(k = 1, 2, \cdots, n)$.


As before, if $\xi_1(x), \cdots, \xi_n(x)$ make up a fundamental system of (16.28), then
the matrix corresponding to $W(x)$ is called a fundamental matrix. In
any case, note that the columns of the matrix corresponding to $W(x)$ are
$n$ solutions of the system (16.29). We may therefore immediately state a
result analogous to the Liouville formula given in Sect. 16.2.5, noting that
$\mathrm{tr}\,A(x) = -a_1(x)$:


♠ Theorem:

The Wronskian $W(x)$ of any collection $\xi_1(x), \cdots, \xi_n(x)$ of solutions of
(16.28) satisfies the relation

\[
W(x) = W(x_0) \exp\left[ -\int_{x_0}^{x} a_1(s)\, ds \right], \qquad r_1 < x_0, x < r_2.
\]


Finally, we have the result corresponding to the theorem in Sect. 16.2.5, for
which the proof is virtually the same.

♠ Theorem:

A necessary and sufficient condition for $\xi_1(x), \cdots, \xi_n(x)$ to be a
fundamental system of solutions of equation (16.28) is that

\[
W(x) \neq 0 \quad \text{for } r_1 < x < r_2.
\]


Example Assume a second-order equation

\[
y''(x) + a(x) y(x) = 0.
\]

For any two solutions $\xi_1(x)$ and $\xi_2(x)$, we have

\[
W(x) = \det \begin{pmatrix} \xi_1 & \xi_2 \\ \xi_1' & \xi_2' \end{pmatrix} = \text{const},
\]

because the equation has no $y'$ term, so that the coefficient $a_1$ in the preceding formula vanishes.
The constant is nonzero if and only if $\xi_1$ and $\xi_2$ are linearly independent.

Remark. The fact that linear independence implies a nonvanishing Wronskian
is a property of solutions of linear equations; i.e., it does not hold for arbitrary
functions. To see this, we consider the functions $\xi_1(x) = x^3$ and $\xi_2(x) = |x|^3$.
They are linearly independent on $-\infty < x < \infty$, but

\[
W(x) = \det \begin{pmatrix} x^3 & |x|^3 \\ 3x^2 & 3x|x| \end{pmatrix} = 0.
\]

This results from the fact that $\xi_1(x)$ and $\xi_2(x)$ cannot both be solutions near
$x = 0$ of a second-order linear equation. In fact, they both satisfy $\xi(0) =
\xi'(0) = 0$ yet are distinct, which violates uniqueness.
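
The Remark can be checked numerically with a short sketch. The function names below are our own, and the derivative of $|x|^3$ is entered by hand as $3x|x|$; the sample points are integers so that the arithmetic is exact.

```python
def xi1(x):
    """First function of the Remark: x^3."""
    return x ** 3


def xi2(x):
    """Second function of the Remark: |x|^3."""
    return abs(x) ** 3


def d_xi1(x):
    return 3 * x ** 2


def d_xi2(x):
    # d/dx |x|^3 = 3 x |x| for every real x, including x = 0.
    return 3 * x * abs(x)


def wronskian(x):
    """W(x) = xi1 * xi2' - xi2 * xi1'."""
    return xi1(x) * d_xi2(x) - xi2(x) * d_xi1(x)


def ratios(points):
    """Set of values xi2(x) / xi1(x) over the nonzero sample points."""
    return {xi2(x) / xi1(x) for x in points if x != 0}


if __name__ == "__main__":
    sample = [-3, -2, -1, 0, 1, 2, 3]
    print([wronskian(x) for x in sample])  # all zeros
    print(ratios(sample))  # two distinct ratios, so xi2 is not a constant multiple of xi1
```

The Wronskian vanishes at every sample point, while the ratio $\xi_2/\xi_1$ takes the value $-1$ for $x < 0$ and $+1$ for $x > 0$, so no single constant relates the two functions on the whole line.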


16.2.7 Particular Solution of an Inhomogeneous System 

We close this section by discussing an inhomogeneous linear equation

\[
y'(x) - A(x) y(x) = q(x). \tag{16.31}
\]

Let $q(x)$ be continuous in $x$ on some interval $I$ and let $\{\varphi_k\}$ $(k = 1, 2, \cdots, n)$
be a fundamental system of solutions for the reduced equation of (16.31). A
general solution of (16.31) can be written as the sum

\[
\varphi(x) = \varphi_p(x) + c_1 \varphi_1(x) + \cdots + c_n \varphi_n(x), \tag{16.32}
\]

where $\varphi_p(x)$ is a particular solution of (16.31) with no adjustable parameter.

A particular solution can be obtained from a fundamental system $\{\varphi_k\}$
$(k = 1, 2, \cdots, n)$ of the reduced equation (16.19) by means of the method of
variation of constant parameters. We assume a particular solution of the
form

\[
\varphi_p(x) = C_1(x) \varphi_1(x) + \cdots + C_n(x) \varphi_n(x), \tag{16.33}
\]

where the coefficients $\{C_k(x)\}$ $(k = 1, 2, \cdots, n)$ are not constants, but
unknown functions of $x$. Differentiating (16.33) with respect to $x$ and substituting it into
(16.31), we obtain

\[
\sum_{k=1}^{n} \left[ C_k'(x) \varphi_k(x) + C_k(x) \varphi_k'(x) - C_k(x) A(x) \varphi_k(x) \right] = q(x). \tag{16.34}
\]

Since $\{\varphi_k\}$ $(k = 1, 2, \cdots, n)$ are solutions of the reduced equation (16.19),
i.e., $\varphi_k' = A \varphi_k$, equation (16.34) yields

\[
\sum_{k=1}^{n} \varphi_k(x) C_k'(x) = q(x). \tag{16.35}
\]

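If we express $\varphi_j(x)$ by its components as

\[
\varphi_j(x) = \left[ \varphi_{1j}(x), \varphi_{2j}(x), \cdots, \varphi_{nj}(x) \right]^T,
\]

equation (16.35) becomes

\[
\sum_{j=1}^{n} \varphi_{kj}(x) C_j'(x) = q_k(x), \tag{16.36}
\]

or equivalently,

\[
\begin{pmatrix}
\varphi_{11} & \varphi_{12} & \cdots & \varphi_{1n} \\
\varphi_{21} & \varphi_{22} & \cdots & \varphi_{2n} \\
\vdots & \vdots & & \vdots \\
\varphi_{n1} & \varphi_{n2} & \cdots & \varphi_{nn}
\end{pmatrix}
\begin{pmatrix} C_1'(x) \\ C_2'(x) \\ \vdots \\ C_n'(x) \end{pmatrix}
=
\begin{pmatrix} q_1(x) \\ q_2(x) \\ \vdots \\ q_n(x) \end{pmatrix}. \tag{16.37}
\]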


The matrix $[\varphi_{kj}]$ on the left-hand side of (16.37) satisfies $\det[\varphi_{kj}] \neq 0$
because of the linear independence of the fundamental system of solutions $\{\varphi_k\}$.
Hence, multiplying both sides of (16.37) by the inverse matrix (see Sect. 18.1.7)
of $[\varphi_{kj}]$, we have

\[
C_k'(x) = p_k(x), \tag{16.38}
\]

where $\{p_k(x)\}$, $k = 1, 2, \cdots, n$, are continuous functions obtained from
(16.37). Thus once the differential equation (16.38) is solved with respect
to $C_k(x)$, the solutions determine a particular solution of the form

\[
\varphi_p(x) = \sum_{k=1}^{n} C_k(x) \varphi_k(x).
\]
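
The procedure can be sketched numerically for a small case. The example below is illustrative only: it assumes the two-dimensional system whose fundamental matrix has columns $(\cos t, -\sin t)$ and $(\sin t, \cos t)$, a hypothetical forcing term $q = (0, 1)^T$, and a composite Simpson rule for the integration of (16.38). All function names are our own.

```python
import math


def fundamental_matrix(t):
    """Phi(t) whose columns are two independent solutions of y' = A y."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c  # the Wronskian; nonzero for a fundamental matrix
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def q(t):
    """Hypothetical forcing term, chosen only for illustration."""
    return [0.0, 1.0]


def c_prime(t):
    """Right-hand side of (16.38): C'(t) = Phi(t)^{-1} q(t)."""
    return mat_vec(inverse_2x2(fundamental_matrix(t)), q(t))


def particular_solution(x, x0=0.0, n=200):
    """phi_p(x) = Phi(x) C(x), where C(x) is the integral of C' from x0 to x,
    evaluated with the composite Simpson rule (n must be even)."""
    h = (x - x0) / n
    total = [0.0, 0.0]
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        cp = c_prime(x0 + i * h)
        total[0] += weight * cp[0]
        total[1] += weight * cp[1]
    coeffs = [total[0] * h / 3, total[1] * h / 3]
    return mat_vec(fundamental_matrix(x), coeffs)


if __name__ == "__main__":
    print(particular_solution(2.0))
    print([1 - math.cos(2.0), math.sin(2.0)])  # closed form for comparison
```

For this choice the integration can also be done by hand: $C(t) = (\cos t - 1, \sin t)^T$, giving the particular solution $(1 - \cos t, \sin t)^T$, which the numerical result reproduces.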


Exercises 


1. Suppose $\varphi_1(x)$, $\varphi_2(x)$ to be two solutions of the ODE $y'' + a_1 y' + a_2 y = 0$
(with constant coefficients $a_1$, $a_2$) on an interval $I$ containing a point $x_0$. Show that

\[
W(\varphi_1, \varphi_2)(x) = e^{-a_1 (x - x_0)}\, W(\varphi_1, \varphi_2)(x_0).
\]

Solution: We have $\varphi_1'' + a_1 \varphi_1' + a_2 \varphi_1 = 0$ and $\varphi_2'' + a_1 \varphi_2' + a_2 \varphi_2 = 0$.
Multiplying the first equation by $-\varphi_2$ and the second by $\varphi_1$ and
adding, we obtain

\[
(\varphi_1 \varphi_2'' - \varphi_1'' \varphi_2) + a_1 (\varphi_1 \varphi_2' - \varphi_1' \varphi_2) = 0.
\]

Note that $W = \varphi_1 \varphi_2' - \varphi_1' \varphi_2$ and $W' = \varphi_1 \varphi_2'' - \varphi_1'' \varphi_2$. Thus $W$
satisfies the first-order equation:

\[
W' + a_1 W = 0,
\]

which implies $W(x) = c\, e^{-a_1 x}$, in which $c$ is some constant. Setting
$x = x_0$, we have $c = e^{a_1 x_0} W(x_0)$, and thus

\[
W(x) = e^{-a_1 (x - x_0)}\, W(x_0). \quad ♣
\]


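2. Assume an n-dimensional linear homogeneous system $y'(x) = A(x) y(x)$
on $I = (a, b)$, and let $\{\varphi_i(x)\}$ be any $n$ solutions. Show that the Wronskian
of $\{\varphi_i(x)\}$ is given by

\[
W(x) = W(x_0) \exp\left[ \int_{x_0}^{x} \mathrm{tr}\,A(s)\, ds \right], \quad \text{where } a < x_0 < b, \tag{16.39}
\]

which is called the Liouville formula.


Solution: We show that $W(x)$ satisfies the differential equation $W'(x) =
\mathrm{tr}\,A(x)\, W(x)$, from which the conclusion (16.39) follows. The expansion
by cofactors of $W(x)$ along the $i$th row yields

\[
W(x) = \sum_{j=1}^{n} \varphi_{ij}(x) \Delta_{ij}(x), \tag{16.40}
\]

where $\varphi_{ij}(x)$ is the $i$th element of $\varphi_j(x)$ and $\Delta_{ij}(x)$ is the corresponding cofactor
of $W(x)$ (see Sect. 18.1.7 for the definition of the cofactor). Note that
$\Delta_{ij}(x)$ does not contain the term $\varphi_{ij}(x)$. Hence, if $W(x)$ given in
(16.40) is regarded as a function of the $\varphi_{ij}(x)$, we have $\partial W / \partial \varphi_{ij} =
\Delta_{ij}(x)$ and, by the chain rule,

\[
W'(x) = \sum_{i,j=1}^{n} \frac{\partial W}{\partial \varphi_{ij}}\, \varphi_{ij}'(x)
      = \sum_{i=1}^{n} \left[ \sum_{j=1}^{n} \varphi_{ij}'(x) \Delta_{ij}(x) \right]. \tag{16.41}
\]

We define $W_i(x)$ as

\[
W_i(x) = \det \begin{pmatrix}
\varphi_{11}(x) & \cdots & \varphi_{1n}(x) \\
\vdots & & \vdots \\
\varphi_{i1}'(x) & \cdots & \varphi_{in}'(x) \\
\vdots & & \vdots \\
\varphi_{n1}(x) & \cdots & \varphi_{nn}(x)
\end{pmatrix},
\]

where all the elements in the $i$th row are differentiated. Then, the
expression in the square brackets in (16.41) is the expansion of $W_i(x)$
by cofactors, so that

\[
W'(x) = \sum_{i=1}^{n} W_i(x). \tag{16.42}
\]

Furthermore, since $\varphi_{ij}'(x) = \sum_{k=1}^{n} a_{ik}(x) \varphi_{kj}(x)$, we have

\[
W_i(x) = \det \begin{pmatrix}
\varphi_{11}(x) & \cdots & \varphi_{1n}(x) \\
\vdots & & \vdots \\
\sum_{k=1}^{n} a_{ik} \varphi_{k1}(x) & \cdots & \sum_{k=1}^{n} a_{ik} \varphi_{kn}(x) \\
\vdots & & \vdots \\
\varphi_{n1}(x) & \cdots & \varphi_{nn}(x)
\end{pmatrix}.
\]

Multiply the $k$th row $(k \neq i)$ of this matrix by $-a_{ik}(x)$ and then
add it to the $i$th row. This process does not change the value of the
determinant $W_i(x)$, but gives the relation

\[
W_i(x) = \det \begin{pmatrix}
\varphi_{11}(x) & \cdots & \varphi_{1n}(x) \\
\vdots & & \vdots \\
a_{ii} \varphi_{i1}(x) & \cdots & a_{ii} \varphi_{in}(x) \\
\vdots & & \vdots \\
\varphi_{n1}(x) & \cdots & \varphi_{nn}(x)
\end{pmatrix}
= a_{ii}(x) W(x). \tag{16.43}
\]

From (16.42) and (16.43), we arrive at the desired result. ♣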


16.3 Autonomous Systems of ODEs 

16.3.1 Autonomous System 

We noted earlier that an nth-order ODE reduces to the first-order form:

\[
\frac{d}{dx} \begin{pmatrix} y_1(x) \\ y_2(x) \\ \vdots \\ y_n(x) \end{pmatrix}
= \begin{pmatrix}
F_1(x; y_1, y_2, \cdots, y_n) \\
F_2(x; y_1, y_2, \cdots, y_n) \\
\vdots \\
F_n(x; y_1, y_2, \cdots, y_n)
\end{pmatrix}
= F(x, y), \tag{16.44}
\]

where $y(x)$ and $F(x, y)$ are $n$-component column vectors. Particularly important in many
applications is the case where $F(x, y)$ does not depend explicitly on $x$.
Relevant terminology is given below.


♠ Autonomous system of ODEs:

A system of first-order ODEs of the form

\[
y'(x) = F(y)
\]

is called an autonomous system, wherein $F$ does not depend explicitly
on the independent variable $x$.

If $F$ does depend explicitly on $x$, the system is said to be nonautonomous.

Example Consider a second-order ODE of the form

\[
u''(x) = f\left( u(x), u'(x) \right).
\]

Setting $y_1(x) = u(x)$ and $y_2(x) = u'(x)$, we have an autonomous system such
as

\[
\frac{d}{dx} \begin{pmatrix} y_1(x) \\ y_2(x) \end{pmatrix}
= \begin{pmatrix} y_2(x) \\ f\left( y_1(x), y_2(x) \right) \end{pmatrix}
= F(y).
\]
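
The reduction above is also how such equations are usually handed to a numerical integrator. The following sketch is illustrative: the helper names (`as_first_order`, `rk4_step`, `integrate`) are our own, the classical fourth-order Runge-Kutta scheme is one common choice among many, and the test equation $u'' = -u$ is a hypothetical example chosen because its solution is known.

```python
import math


def as_first_order(f):
    """Turn u'' = f(u, u') into y' = F(y) with y = [u, u']."""
    def F(_x, y):
        return [y[1], f(y[0], y[1])]
    return F


def rk4_step(F, x, y, h):
    """One classical Runge-Kutta step of size h for y' = F(x, y)."""
    k1 = F(x, y)
    k2 = F(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = F(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = F(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(F, y0, x_end, n_steps):
    """Integrate from x = 0 to x = x_end in n_steps equal steps."""
    h = x_end / n_steps
    x, y = 0.0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(F, x, y, h)
        x += h
    return y


if __name__ == "__main__":
    # u'' = -u with u(0) = 1, u'(0) = 0 has the solution u = cos x.
    F = as_first_order(lambda u, du: -u)
    print(integrate(F, [1.0, 0.0], 1.0, 100))
    print([math.cos(1.0), -math.sin(1.0)])
```

Because $F$ ignores its first argument, the same integrator serves every autonomous system; only the function `f` changes.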


16.3.2 Trajectory 


As a prototype of autonomous systems of ODEs, we consider a two-dimensional
system such that

\[
\frac{d}{dt} \begin{pmatrix} y_1(t) \\ y_2(t) \end{pmatrix}
= \begin{pmatrix} f_1(y_1, y_2) \\ f_2(y_1, y_2) \end{pmatrix}
= f(y), \tag{16.45}
\]

where $y_1(t)$, $y_2(t)$ are unknown functions of $t$ in some interval $I$. We assume
that $f_1(y_1, y_2)$ and $f_2(y_1, y_2)$ are defined in some domain $D$ and satisfy the
Lipschitz condition in both $y_1$ and $y_2$. If $t_0$ is any real number and
$(y_{10}, y_{20})$ is any point of $D$, the above hypotheses
guarantee the existence and uniqueness of solutions for (16.45),

\[
y_1(t) = \varphi_1(t), \qquad y_2(t) = \varphi_2(t),
\]

satisfying the initial conditions

\[
\varphi_1(t_0) = y_{10}, \qquad \varphi_2(t_0) = y_{20}.
\]

We now consider a subdomain $R$ of $D$ in which $f_1(y_1, y_2)$ does not vanish.
Then, we have in $R$ the relation

\[
\frac{dy_2}{dy_1} = \frac{dy_2}{dt} \frac{dt}{dy_1} = \frac{dy_2/dt}{dy_1/dt} = \frac{f_2(y_1, y_2)}{f_1(y_1, y_2)}, \tag{16.46}
\]

which represents a direction field in the $(y_1$-$y_2)$-plane as noted in Sect. 15.1.6.
From the uniqueness theorem, there exists a unique integral curve of (16.46)
in $R$ satisfying the initial conditions. Such an integral curve in the $(y_1$-$y_2)$-plane
is called a trajectory of (16.45).


♠ Theorem: At most one trajectory passes through any point.


Proof This follows from the uniqueness of solutions. If two distinct
trajectories crossed at some point, then taking the crossing point as an initial
value point would give two different solutions with the same initial value,
which contradicts uniqueness. ♣

Remark. When the vector field

\[
v(x_1, x_2) = \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix}
= \begin{pmatrix} f_1(x_1, x_2) \\ f_2(x_1, x_2) \end{pmatrix}
\]

describes the motion of a point in $R$, the domain $R$ is called a phase space
of the system (16.45).

16.3.3 Critical Point 

Suppose that the autonomous system (16.45) has a time-independent solution
expressed by

\[
\varphi(t) = c \in D,
\]

where $c = (c_1, c_2)$ is a constant vector. Then, no other trajectory can pass through
the point $c$ (see the theorem in Sect. 16.3.2). In addition, we obviously have

\[
\varphi'(t) = 0 = f(c).
\]

Conversely, if there exists a point $c$ in $R$ for which $f(c) = 0$, then the function
$\varphi(t) = c$ is a solution of (16.45). The point $c$ is said to be a critical point
(or singular point or point of equilibrium).

♠ Critical point:

Assume an autonomous system

\[
y'(x) = F(y) \quad \text{for } y \in D. \tag{16.47}
\]

Then, any point $c \in D$ that gives

\[
F(c) = 0
\]

is called a critical point of (16.47). Any other point in $D$ is called a
regular point.


16.3.4 Stability of a Critical Point 

Let us discuss the stability of a critical point of an autonomous system 
(16.45) by analyzing trajectories of its solutions around the critical point. 

We assume throughout that the function $F(y)$ is differentiable of the first
order on $D$, which guarantees the existence and uniqueness of solutions of
the initial value problem (16.45). Then, the solutions of (16.45) can be
conveniently pictured as curves in the phase space.

Now we consider a solution $\psi(x)$ of (16.45) that passes through a point
$\eta$ at $x = x_0$, where $\eta$ is different from the critical point $c$ but the distance
between $\eta$ and $c$ is small, and we follow the trajectory that starts at $\eta$. If the
resulting motion $\psi$ remains close to the critical point $c$ for $x > x_0$, then the
critical point is said to be stable, but if the solution $\psi$ tends to return to
the critical point $c$ as $x$ increases to infinity, then the critical point is said
to be asymptotically stable. Finally, if the solution $\psi$ leaves every small
neighborhood of $c$, the critical point is said to be unstable. More precisely,
we have the following definitions:


♠ Stability of a critical point:

Let $c$ be a critical point of the autonomous system $y'(x) = F(y)$, so
that $F(c) = 0$. The critical point $c$ is called:

(i) stable when given a positive $\varepsilon$, there exists a $\delta$ so small that

$|y(0) - c| < \delta \Rightarrow |y(x) - c| < \varepsilon$ for all $x > 0$;

(ii) asymptotically stable when for some $\delta$,

$|y(0) - c| < \delta \Rightarrow \lim_{x \to \infty} |y(x) - c| = 0$;

(iii) strictly stable when it is stable and asymptotically stable;

(iv) neutrally stable when it is stable but not asymptotically stable; and

(v) unstable when it is not stable.


16.3.5 Linear Autonomous System 


An autonomous system $y' = F(y)$ is called linear if and only if all the
elements $F_i$ of $F$ are linear homogeneous functions of the $y_k$, so that

\[
\frac{dy_i}{dx} = a_{i1} y_1 + \cdots + a_{in} y_n \quad (i = 1, \cdots, n).
\]

Hence, a linear autonomous system is just a (homogeneous) linear system of
ODEs with constant coefficients. The analyses for linear systems are generally
useful since, near a point $y = y_0$, we can approximate each $F_i(y)$ by the linear
terms of its Taylor expansion about $y_0$ in order to analyze the local behavior.

We now discuss in detail the case $n = 2$ of linear plane autonomous systems
of the form $y' = Ay$. Any such system is expressed by

\[
\frac{dx}{dt} = ax + by, \qquad \frac{dy}{dt} = cx + dy, \tag{16.48}
\]

where $x = y_1$, $y = y_2$, and

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

with $a, b, c, d$ being constants. Observe that the simultaneous linear equations

\[
ax + by = cx + dy = 0
\]

have no solution except

\[
x = y = 0
\]

unless $\det A = 0$. We thus see that the origin is the only critical point of the
system (16.48) unless $ad = bc$.

Relevant terminology is given below. 


♠ Secular equation:

If $(x(t), y(t))$ is a solution of (16.48), then $x(t)$ and $y(t)$ satisfy the
equation:

\[
u'' - (a + d) u' + (ad - bc) u = 0. \tag{16.49}
\]

This equation is called the secular equation of the autonomous system
(16.48).


Proof The first equation of (16.48) says that

\[
by = x' - ax,
\]

which implies that

\[
x'' - ax' = by'.
\]

We thus have

\[
x'' - ax' = b(cx + dy) = bcx + d(x' - ax),
\]

or equivalently,

\[
x'' - (a + d) x' + (ad - bc) x = 0.
\]

The proof for $y(t)$ is the same, interchanging $a$ with $d$ and $b$ with $c$. ♣

The secular equation (16.49) has an important property associated with the
nature of the critical point. This is seen by introducing the concept of the
characteristic polynomial $P$ of (16.49) as

\[
P = \det(A - \lambda I)
  = \det \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix}
  = \lambda^2 - (a + d) \lambda + (ad - bc).
\]

If $\lambda_j$ $(j = 1, 2)$ are the roots of $P = 0$, then there exist nonzero eigenvectors
$(x_j, y_j)$ such that

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x_j \\ y_j \end{pmatrix}
= \begin{pmatrix} a x_j + b y_j \\ c x_j + d y_j \end{pmatrix}
= \lambda_j \begin{pmatrix} x_j \\ y_j \end{pmatrix}.
\]

From this, it follows that, provided $\lambda_1 \neq \lambda_2$, the functions

\[
\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} x_j \\ y_j \end{pmatrix} e^{\lambda_j t} \quad (j = 1, 2)
\]

are a basis of vector-valued solutions of (16.48). We shall see later that the
nature of a critical point of a system is completely determined by the values
of the roots $\lambda_1$, $\lambda_2$.


16.4 Classification of Critical Points 

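The behavior of trajectories of a linear autonomous system

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = A \begin{pmatrix} u \\ v \end{pmatrix}, \qquad
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{16.50}
\]

near its critical point depends on the eigenvalues of the matrix $A$, denoted by
$\lambda_1$ and $\lambda_2$. There are five cases to consider and we discuss each in turn.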


16.4.1 Improper Node 

We first consider the case where $\lambda_1$ and $\lambda_2$ are real, unequal, and of the same
sign. A critical point for this case is called an improper node. In this case, all
the trajectories approach the critical point tangentially to the same straight
line with increasing $t$.

Example An example of improper nodes is given by

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} -2 & 0 \\ 0 & -3 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \tag{16.51}
\]

The eigenvalues are obviously $\lambda = -2, -3$ and the corresponding eigenvectors
are $(1, 0)$ and $(0, 1)$. The general solution to (16.51) is

\[
\begin{pmatrix} u \\ v \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-2t} + c_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{-3t}. \tag{16.52}
\]

The trajectories given by (16.52) for several values of $c_1$ and $c_2$ are shown in
Fig. 16.1.

Fig. 16.1. Trajectories associated with the improper node of the system (16.51)


Remark. If the eigenvalues are real, unequal, and positive (contrary to the 
above example), then the trajectories are similar to those in Fig. 16.1 except 
that the directions of the arrows are reversed; in other words the trajectories 
recede from the critical point and go off toward infinity. 


16.4.2 Saddle Point 

We next consider the case where $\lambda_1$ and $\lambda_2$ are real, unequal, and of the
opposite sign. In this case, the trajectories approach the critical point along
one eigenvector direction and recede along the other eigenvector direction.
The critical point in this case is called a saddle point.

Example Assume the system

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \tag{16.53}
\]

The eigenvalues are $\lambda = -1, 2$, and the corresponding eigenvectors are $(1, 0)$
and $(1, 3)$, respectively. The general solution to (16.53) is

\[
\begin{pmatrix} u \\ v \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-t} + c_2 \begin{pmatrix} 1 \\ 3 \end{pmatrix} e^{2t}. \tag{16.54}
\]

The trajectories given by (16.54) are shown in Fig. 16.2. As (16.54) consists
of an $e^{-t}$ term and an $e^{2t}$ term, the trajectories approach the origin along the
eigenvector direction $(1, 0)$ and recede along the direction $(1, 3)$ as $t$ increases.

Fig. 16.2. Trajectories around the saddle point of the system (16.53)


16.4.3 Proper Node 


We next consider the case of two roots of the characteristic equation being
real and equal. This type of critical point is called a proper node.

Example We consider the critical point of the system

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \tag{16.55}
\]

The critical point occurs at the origin, with the degenerate eigenvalue being $a$.
Generally when the eigenvalue $\lambda$ of the characteristic equation is degenerate,
the solution takes the form

\[
u(t) = (c_1 + c_2 t) e^{\lambda t}, \qquad v(t) = (c_3 + c_4 t) e^{\lambda t}. \tag{16.56}
\]

Hence, we set $\lambda = a$ in (16.56) and substitute the results into (16.55) to obtain
$c_2 = c_4 = 0$. The solution to (16.55) is thus

\[
u(t) = c_1 e^{at}, \qquad v(t) = c_3 e^{at}. \tag{16.57}
\]

Eliminating $t$ from (16.57) yields the expression of the trajectories:

\[
v = \frac{c_3}{c_1} u \quad \text{if } c_1 \neq 0
\]

and

\[
u = 0 \quad \text{if } c_1 = 0,
\]

both of which are depicted in Fig. 16.3. The trajectories approach or recede
from the origin, depending on the sign of $a$.

Fig. 16.3. Trajectories around the proper node of the system (16.55)


16.4.4 Spiral Point 

So far, we have restricted our attention to cases where the two eigenvalues
are real. Now we consider the case in which the two eigenvalues are complex
conjugates of each other. The corresponding critical point is called a spiral
point or a focus.

Example An example for this case is

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \tag{16.58}
\]

The critical point is at the origin, and the eigenvalues are $\lambda_\pm = -1 \pm i$ with
corresponding eigenvectors $(1, 1 \mp i)$. The general solution to this system is

\[
\begin{pmatrix} u \\ v \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 1 - i \end{pmatrix} e^{(-1 + i)t} + c_2 \begin{pmatrix} 1 \\ 1 + i \end{pmatrix} e^{(-1 - i)t}. \tag{16.59}
\]

The result represents a family of curves that spiral into the critical point as
$t$ increases. Real components of the solutions $u(t)$ and $v(t)$ given by (16.59)
are plotted in Fig. 16.4.

16.4.5 Center 

The final class of critical points is called a center, for which the two
eigenvalues are pure imaginary. In this case, trajectories consist of a family of closed
loops centered about the critical point.

Fig. 16.4. Trajectories for the spiral point of the system (16.58)


Example Consider the system

\[
\frac{d}{dt} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}. \tag{16.60}
\]

The eigenvalues and corresponding eigenvectors are $\lambda_\pm = \pm i$ and $(1 \pm i, -1)$,
and the general solution reads

\[
\begin{pmatrix} u \\ v \end{pmatrix} = c_1 \begin{pmatrix} 1 + i \\ -1 \end{pmatrix} e^{it} + c_2 \begin{pmatrix} 1 - i \\ -1 \end{pmatrix} e^{-it}. \tag{16.61}
\]

Figure 16.5 shows several trajectories for different values of $c_1$ and $c_2$. All the
trajectories represent periodic motion about the critical point.
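
The five cases can be summarized in a small classifier. The sketch below is our own illustration, not a procedure from the text: it decides the type from the trace and determinant of $A$ through the discriminant of the characteristic polynomial, follows the naming used in this section, and treats $\det A = 0$ as an excluded degenerate case. The sample matrices in the usage lines are chosen to have the eigenvalues quoted in the examples above.

```python
def classify_critical_point(a, b, c, d):
    """Classify the origin of u' = a u + b v, v' = c u + d v.

    Uses lambda^2 - (a + d) lambda + (ad - bc) = 0; exact comparisons are
    reliable for integer entries, while floats may need a tolerance.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if det == 0:
        return "degenerate (zero eigenvalue)"
    if disc > 0:
        # Real, unequal eigenvalues: same sign when their product det > 0.
        return "improper node" if det > 0 else "saddle point"
    if disc == 0:
        # Real, equal eigenvalues (named a proper node in this section).
        return "proper node"
    # Complex-conjugate eigenvalues: purely imaginary when the trace vanishes.
    return "center" if trace == 0 else "spiral point"


if __name__ == "__main__":
    print(classify_critical_point(-2, 0, 0, -3))   # improper node
    print(classify_critical_point(-1, 1, 0, 2))    # saddle point
    print(classify_critical_point(2, 0, 0, 2))     # proper node
    print(classify_critical_point(0, -1, 2, -2))   # spiral point
    print(classify_critical_point(1, 2, -1, -1))   # center
```

Working with the trace and determinant avoids computing complex square roots: the sign of the discriminant separates real from complex eigenvalues, and the sign of the determinant separates nodes from saddles.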


16.4.6 Limit Cycle 

Before closing this section, we have one more topic to discuss. Consider the
system

\[
x' = x + y - x (x^2 + y^2), \qquad
y' = -x + y - y (x^2 + y^2). \tag{16.62}
\]

The only critical point is at the origin. Letting $x = r \cos\theta$ and $y = r \sin\theta$, the
system (16.62) becomes

\[
r' = r (1 - r^2) \quad \text{and} \quad \theta' = -1. \tag{16.63}
\]

Fig. 16.5. Periodic trajectories for the center of the system (16.60)


The system (16.63) has a trivial solution of $r = 1$, $\theta = -t + \text{const}$, which
represents periodic clockwise motion around the unit circle. We can find other
solutions by solving (16.63). The equation for $r$ is easy to solve, yielding

\[
r(t) = \frac{r_0}{\sqrt{r_0^2 + (1 - r_0^2) e^{-2t}}},
\]

where $r(0) = r_0$. Figure 16.6 shows $r(t)$ plotted against $t$ for $r_0 > 1$ and
$r_0 < 1$. The trajectories spiral in toward the unit circle as $t \to \infty$ if $r_0 > 1$
and they spiral out toward the unit circle as $t \to \infty$ if $0 < r_0 < 1$. Hence, all the
trajectories other than the critical point itself approach the unit circle as $t \to \infty$.

Fig. 16.6. Converging behavior of solutions of the system (16.62) to a limit cycle

The unit circle mentioned above is called a limit cycle. Limit cycles are
important for determining the stability of the system, since the existence of a
limit cycle ensures the existence of periodic solutions to a system.
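
The approach to the limit cycle can be observed numerically. The sketch below is illustrative: it integrates (16.62) in Cartesian form with a classical Runge-Kutta step (our choice of method, with our own function names) and compares the resulting radius with the closed-form solution of $r' = r(1 - r^2)$.

```python
import math


def rhs(x, y):
    """Right-hand side of (16.62)."""
    r2 = x * x + y * y
    return x + y - x * r2, -x + y - y * r2


def rk4_step(x, y, h):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + h / 2 * k1x, y + h / 2 * k1y)
    k3x, k3y = rhs(x + h / 2 * k2x, y + h / 2 * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y))


def radius_numeric(r0, t_end, n_steps=1000):
    """Radius at time t_end for the trajectory starting at (r0, 0)."""
    h = t_end / n_steps
    x, y = r0, 0.0
    for _ in range(n_steps):
        x, y = rk4_step(x, y, h)
    return math.hypot(x, y)


def radius_exact(r0, t):
    """Closed-form solution of r' = r (1 - r^2) with r(0) = r0 > 0."""
    return r0 / math.sqrt(r0 * r0 + (1 - r0 * r0) * math.exp(-2 * t))


if __name__ == "__main__":
    for r0 in (0.2, 2.0):
        print(r0, radius_numeric(r0, 5.0), radius_exact(r0, 5.0))
```

Starting inside ($r_0 = 0.2$) or outside ($r_0 = 2$) the unit circle, the computed radius tends to 1, in line with the description of Fig. 16.6.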

Exercises 

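1. Consider the system given by

\[
x' = e^{x} + \sin 5y - \cos 2y, \qquad
y' = x + 2 \sin y.
\]

Find the equilibrium point and describe the stability of the system around
the point.

Solution: The origin is an equilibrium point. Expanding all functions on the
right-hand side around $x = 0$, $y = 0$, we have

\[
x' = x + 5y + g(x, y), \qquad
y' = x + 2y + h(x, y).
\]

The functions $g$, $h$ converge to zero faster than $\sqrt{x^2 + y^2}$, and the
characteristic equation becomes

\[
\det \begin{pmatrix} 1 - \lambda & 5 \\ 1 & 2 - \lambda \end{pmatrix} = \lambda^2 - 3\lambda - 3 = 0,
\]

whose roots are $(3 \pm \sqrt{21})/2$. Since $\sqrt{21} > 3$, one root is positive and the
other is negative, so the origin is a saddle point and the system
is unstable. ♣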
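
The linearization in this exercise can be cross-checked numerically. The sketch below is an illustration under stated assumptions: it reads the right-hand side as $e^x + \sin 5y - \cos 2y$ and $x + 2\sin y$, approximates the Jacobian matrix at the origin by central differences, and obtains the eigenvalues from the quadratic formula. The function names are our own.

```python
import math


def field(x, y):
    """Right-hand side of the exercise system (terms read as sin(5y), cos(2y))."""
    return (math.exp(x) + math.sin(5 * y) - math.cos(2 * y),
            x + 2 * math.sin(y))


def jacobian(x0, y0, h=1e-5):
    """Central-difference approximation of the Jacobian matrix at (x0, y0)."""
    fxp, gxp = field(x0 + h, y0)
    fxm, gxm = field(x0 - h, y0)
    fyp, gyp = field(x0, y0 + h)
    fym, gym = field(x0, y0 - h)
    return [[(fxp - fxm) / (2 * h), (fyp - fym) / (2 * h)],
            [(gxp - gxm) / (2 * h), (gyp - gym) / (2 * h)]]


def real_eigenvalues(m):
    """Roots of lambda^2 - tr*lambda + det = 0, assuming a nonnegative discriminant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2


if __name__ == "__main__":
    J = jacobian(0.0, 0.0)
    print(J)                    # close to [[1, 5], [1, 2]]
    print(real_eigenvalues(J))  # one negative and one positive root
```

The two eigenvalues have opposite signs, which is the signature of a saddle point and confirms that the equilibrium at the origin is unstable.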


16.5 Applications in Physics and Engineering 

16.5.1 Van der Pol Generator 

As a physical example of a system in which a limit cycle may occur, we
consider the following electric circuit consisting of a coil with inductance $L$
and a condenser with capacitance $C$ attached to a tunnel diode. A tunnel
diode is a nonlinear element in the sense that it exhibits nonlinear
current-voltage characteristics:

\[
I(V) = I_0 - a (V - V_0) + b (V - V_0)^3. \tag{16.64}
\]

It follows from Fig. 16.7 that a tunnel diode behaves like an ordinary resistor
at low and high voltages, but like a negative resistor at intermediate voltages.
Thus, a tunnel diode is expected to amplify small oscillations in the system,
provided we choose the parameters in an appropriate manner.

Fig. 16.7. Left: An electric circuit consisting of a coil with inductance $L$, a condenser
with capacitance $C$, and a tunnel diode. Right: Plot of the nonlinear current-voltage
characteristics $I(V)$ of the tunnel diode

The equation of motion for the LC circuit attached to a tunnel diode is
obtained as described below. The law of conservation of current
ensures that

\[
I_L + I(V) + I_C = 0, \tag{16.65}
\]

where

\[
I_L = \frac{1}{L} \int V\, dt \quad \text{and} \quad I_C = C \frac{dV}{dt}. \tag{16.66}
\]

Substituting (16.64) and (16.66) into (16.65) and then differentiating with
respect to time, we get

\[
\ddot{V} + \frac{1}{C} \left[ -a + 3b (V - V_0)^2 \right] \dot{V} + \omega_0^2 V = 0,
\]

where we introduce the resonant frequency $\omega_0$ defined by the equation $\omega_0^2 =
1/(LC)$. For simplicity, we define a new variable

\[
x = V - V_0
\]

(measuring the voltage from $V_0$, so that the constant offset term $\omega_0^2 V_0$ is dropped),
for which

\[
\ddot{x} - \alpha (1 - \beta x^2) \dot{x} + \omega_0^2 x = 0 \tag{16.67}
\]

with

\[
\dot{x} = \frac{dx}{dt}, \qquad \alpha = \frac{a}{C}, \qquad \beta = \frac{3b}{a}.
\]

Equation (16.67) can be further simplified by replacing $x$ by $x/\sqrt{\beta}$ and
introducing a new time variable $\omega_0 t$, which we again denote by $t$. Hence, we finally obtain

\[
\ddot{x} - \mu (1 - x^2) \dot{x} + x = 0 \tag{16.68}
\]

with the following key parameter:

\[
\mu = \frac{\alpha}{\omega_0}.
\]

Fig. 16.8. Trajectories for the Van der Pol equation (16.68) with the parameter
$\mu = 0.3$ for (a) and $\mu = 3.0$ for (b)
The nonlinear differential equation (16.68) is known as the Van der Pol 
equation. As shown below, it describes self-sustaining oscillations in which 
energy is supplied to small oscillations and removed from large oscillations, 
which gives rise to the limit cycle in the phase space. 

We can observe the self-exciting behavior of the system governed by (16.68)
in the phase space plot in Fig. 16.8, where we set $\mu = 0.3$ and $\mu = 3.0$ for
various initial points $x_0 = (x(t = 0), \dot{x}(t = 0))$. We see that all the trajectories
starting at a point $x_0$ inside (or outside) a closed contour $C$ move outward (or
inward) as $t$ increases and then converge to the contour $C$; such a characteristic
closed contour is known as the limit cycle of the system. The shape of the limit
cycle depends on the value of $\mu$, as is evident from Fig. 16.8.
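
The convergence to the limit cycle can be reproduced with a short simulation. The sketch below is illustrative rather than the procedure used for Fig. 16.8: it writes (16.68) as a first-order system in $(x, v)$ with $v = \dot{x}$, advances it with a classical Runge-Kutta step, and records the largest $|x|$ late in the run. The step size, run length, and function names are our own assumptions. For small $\mu$ the limit cycle is known to have an amplitude close to 2.

```python
def van_der_pol_rhs(mu):
    """First-order form of (16.68): x' = v, v' = mu (1 - x^2) v - x."""
    def rhs(x, v):
        return v, mu * (1.0 - x * x) * v - x
    return rhs


def rk4_step(rhs, x, v, h):
    """One classical Runge-Kutta step of size h."""
    k1x, k1v = rhs(x, v)
    k2x, k2v = rhs(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
    k3x, k3v = rhs(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
    k4x, k4v = rhs(x + h * k3x, v + h * k3v)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)


def late_amplitude(mu, x0, v0, t_end=100.0, h=0.01, tail=20.0):
    """Largest |x| over the last `tail` time units of the trajectory."""
    rhs = van_der_pol_rhs(mu)
    n_steps = int(round(t_end / h))
    tail_start = int(round((t_end - tail) / h))
    x, v, amplitude = x0, v0, 0.0
    for i in range(n_steps):
        x, v = rk4_step(rhs, x, v, h)
        if i >= tail_start:
            amplitude = max(amplitude, abs(x))
    return amplitude


if __name__ == "__main__":
    print(late_amplitude(0.3, 0.1, 0.0))  # start inside the limit cycle
    print(late_amplitude(0.3, 4.0, 0.0))  # start outside the limit cycle
```

Both initial points end up on the same closed orbit: the small oscillation grows and the large one decays until the amplitudes agree.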
Partial Differential Equations


Abstract Broadly speaking, there are three classes of partial differential equations 
that are relevant to mathematical physics, as reflected in the section titles of this 
chapter. After examining the basic properties common to all the abovementioned 
classes of equations, we devote the balance of this chapter to a discussion of the 
mathematical essence of each class. 


17.1 Basic Properties 

17.1.1 Definitions 


In this section we present the basic theory of partial differential equations
(PDEs), an understanding of which is crucial for describing or predicting
natural phenomena. The formal definition is given below.


♠ Partial differential equations:

A partial differential equation of order r is a functional equation of the
form

\[
F\left( x_1, x_2, \cdots, x_n; u, \frac{\partial u}{\partial x_1}, \cdots, \frac{\partial u}{\partial x_n}, \frac{\partial^2 u}{\partial x_1^2}, \cdots \right) = 0, \tag{17.1}
\]

which involves at least one rth-order partial derivative of the unknown
function $u = u(x_1, x_2, \cdots, x_n)$ of independent variables $x_1, x_2, \cdots, x_n$.


In this chapter we often denote partial derivatives with subscripts such as

\[
\frac{\partial u}{\partial x} = u_x = \partial_x u, \qquad
\frac{\partial^2 u}{\partial x \partial y} = u_{xy} = \partial_x \partial_y u.
\]






We also use the shorthand

\[
\partial_j = \frac{\partial}{\partial x_j}, \qquad
\partial_i \partial_j = \frac{\partial^2}{\partial x_i \partial x_j}.
\]

Then, the general form (17.1) of a PDE is expressed as

\[
F(x, y, \cdots; u, u_x, u_y, \cdots, u_{xx}, u_{xy}, \cdots) = 0, \tag{17.2}
\]

where $u = u(x, y, \cdots)$ is the unknown function of independent variables
$x, y, \cdots$. A solution (or integral) of a PDE is a function $\varphi(x, y, \cdots)$
satisfying equation (17.2) identically, at least in some region of the
independent variables $x, y, \cdots$.

17.1.2 Subsidiary Conditions 

The general solution of (17.1) depends on an arbitrary function. This
statement is valid even for higher-order PDEs, indicating that a PDE has in
general many solutions. Hence, in order to determine a unique solution, auxiliary
conditions must be imposed. Such conditions are usually called initial
conditions when they refer to time or boundary conditions when they refer to position.


Initial condition:

In physics, an unknown function in a PDE usually involves independent
variables of time $t$ and position $x, y, \cdots$. Initial conditions are imposed
on the unknown function and/or its time derivatives at a particular (initial)
time $t = t_0$.

Boundary condition:

Boundary conditions are imposed on an unknown function at the boundary
or at infinity of a domain $D$ in which the PDE is valid and are classified
into two cases:

1. Dirichlet condition is the case in which an unknown function $u$ is
specified on the boundary of the domain $D$ (often denoted by $\partial D$),
where $u$ is a function of time $t$ and position $x, y, \cdots$.

2. Neumann condition is the case in which the normal derivative of an
unknown function, $\partial u / \partial n$, is specified on the boundary.


17.1.3 Linear and Homogeneous PDEs 

A PDE is called linear if and only if the $F$ of (17.1) is a linear function of $u$
and its derivatives. First we assume a first-order PDE with two independent
variables $x$ and $y$, whose general form reads

\[
F(x, y; u, u_x, u_y) = 0. \tag{17.3}
\]

Then, if it is linear, (17.3) can be expressed by

\[
a(x, y) u_x + b(x, y) u_y + c(x, y) u = g(x, y), \tag{17.4}
\]

where $a$, $b$, $c$, and $g$ are given functions of $x$, $y$. Using the operator $L$, we
express (17.4) in the simple form

\[
L u(x, y) = g(x, y), \tag{17.5}
\]

where the operator $L$ is defined by

\[
L = a(x, y) \partial_x + b(x, y) \partial_y + c(x, y).
\]

The linearity of PDEs guarantees that for any functions $u$, $v$ and any constant
$c$ the relations

\[
L(u + v) = Lu + Lv, \qquad L(cu) = cL(u)
\]

hold.


Examples

$u_{xx} + u_{yy} = u$    (linear)

$u_{xx} - e^{-x} u_{yy} = \sin x$    (linear)

$u u_x + u_y = 0$    (nonlinear)

$x u_x + y u_y + u^2 = 0$    (nonlinear)


A linear equation is said to be homogeneous if every term of the equation contains
either the dependent variable $u$ or one of its derivatives $u_x, u_y, \cdots$, so that no term
depends on the independent variables $x, y, \cdots$ alone. For instance, the PDE (17.5) is homogeneous if

\[
g(x, y) = 0,
\]

since every term of the equation

\[
L u(x, y) = 0 \tag{17.6}
\]

involves $u$, $u_x$, or $u_y$. On the other hand, if $g \neq 0$ in (17.5), it
is called an inhomogeneous (or nonhomogeneous) linear equation. These
statements are generally valid even for higher-order PDEs.


17.1.4 Characteristic Equation 

We consider a first-order homogeneous linear PDE of the form 

a(x, y)d x u(x, y) + b(x, y)d v u(x, y) = 0, (17.7) 

which is the most simple (and thus pedagogical) class of PDEs. In general, 
solutions of PDEs are described by arbitrary functions f(p) of a particular 
independent variable p, wherein p = p{x 1 y) is some combination of independent 
variables x and y. We verify this statement for the case of (17.7). 
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By the chain rule on the derivative, we have 


du dp df du dp df 

dx dx dp ’ dy dy dp 

Hence, the PDE (17.7) can be rewritten in the form 


a(x,y) 


dp 

dx 


b{x, y) 


dp] df_ 
dy \ dp 


= 0 . 


(17.8) 


(17.9) 


This implies that the function form of f(p) may be arbitrary if p = p{x 1 y) 
satisfies the equation 

a ( X,V ^^x +b ^ X,V ^^y = °' (17.10) 

Therefore, an arbitrary function f(p) such that the p satisfies (17.10) serves 
as the solution of the original PDE of (17.7). [The case of df /dp = 0 gives a 
trivial solution of f(p) = u(x,y) = const, which we omit below.] 

To obtain the solution p = p(x, y) of the equation (17.10), we tentatively 
suppose that the function p = p(x, y) takes a constant value along a curve 
C : y = y(x) on the (a:-y)-plane. Then, the total derivative of p on the curve 
C should vanish, so that 


, dp, dp 
dp = - 7 —dx + —dy = 0. 
ox oy 


(17.11) 


From the correspondence between (17.10) and (17.11), we see that these are 
equivalent provided that 


dy 

dx 


bjx.y) 

a(x,y) : 


a(x,y) ± 0. 


(17.12) 


This is called the characteristic equation of the PDE (17.7) and its solution 
y = y(x) is the characteristic curve of (17.7). From (17.11) and (17.12), 
therefore, we obtain the desired function form of p = p(x, y) that makes an 
arbitrary function f(p) the solution of the original PDE. 

Examples We evaluate a general solution of (17.7) in the case that a and b are
nonzero constants. Integrating (17.12) gives y = (b/a)x + const, so each
characteristic curve is a straight line on which p = bx - ay is constant. A general
solution then takes the form

$$ u(x, y) = f(p) = f(bx - ay), \tag{17.13} $$

where f is an arbitrary function. The solution is easily checked by taking
derivatives using (17.8) and substituting them into (17.7). A less trivial
example is given in Exercise 1.
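
The characteristic recipe can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names, the test function sin, the sample point, and the variable-coefficient case a = 1, b = x (for which p = y - x^2/2) are all illustrative assumptions. It evaluates the residual of (17.7) for u = f(bx - ay) by central differences, and follows a characteristic curve of (17.12) with a classical Runge-Kutta step to confirm that p stays constant along it.

```python
import math

def check_constant_coefficient(a, b, f, x, y, h=1e-5):
    """Residual a*u_x + b*u_y for u(x, y) = f(b*x - a*y), by central differences."""
    u = lambda X, Y: f(b * X - a * Y)
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return a * u_x + b * u_y

def follow_characteristic(a, b, x0, y0, x1, steps=1000):
    """Integrate dy/dx = b(x, y) / a(x, y) from (x0, y0) to x = x1 with classical RK4."""
    slope = lambda x, y: b(x, y) / a(x, y)
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = slope(x, y)
        k2 = slope(x + h / 2, y + h * k1 / 2)
        k3 = slope(x + h / 2, y + h * k2 / 2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

# Constant coefficients (assumed values a = 2, b = 3) with f = sin.
residual = check_constant_coefficient(2.0, 3.0, math.sin, 0.4, -0.7)

# Assumed variable coefficients a = 1, b = x: dy/dx = x, so p = y - x**2 / 2.
y_end = follow_characteristic(lambda x, y: 1.0, lambda x, y: x, 0.0, 1.0, 2.0)
p_start = 1.0 - 0.0 ** 2 / 2
p_end = y_end - 2.0 ** 2 / 2
```

The residual vanishes up to discretization error, and p takes the same value at both ends of the integrated curve, which is the property used above to construct p(x, y).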
17.1.5 Second-Order PDEs

The general form of second-order linear PDEs is 

$$ \sum_{i,j=1}^{n} a_{ij}(x_1, x_2, \cdots, x_n)\,\partial_i \partial_j u + \sum_{i=1}^{n} a_i(x_1, x_2, \cdots, x_n)\,\partial_i u + a_0(x_1, x_2, \cdots, x_n)\,u = g(x_1, x_2, \cdots, x_n), \tag{17.14} $$

where the unknown function u depends on n independent variables denoted
by $x_1, x_2, \cdots, x_n$. Note that $a_{ij} = a_{ji}$ since the mixed derivatives are equal.
The form (17.14) represents a very large class of PDEs. Among them, we
restrict our attention to the case g = 0 with real constant coefficients, namely,
second-order linear homogeneous PDEs. The general form of linear PDEs of
second order involving n independent variables with real constant coefficients
is written as

$$ \sum_{i,j=1}^{n} a_{ij}\,\partial_i \partial_j u + \sum_{i=1}^{n} a_i\,\partial_i u + a_0 u = 0. \tag{17.15} $$

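A linear transformation of the independent variables $x = (x_1, x_2, \cdots, x_n)$
to $y = (y_1, y_2, \cdots, y_n)$ is given by

$$ y = Bx, \tag{17.16} $$

or equivalently,

$$ y_k = \sum_{m=1}^{n} b_{km} x_m, $$

where the $b_{km}$ are the elements of the $n \times n$ matrix B. Using the chain rule, we have

$$ \frac{\partial}{\partial x_i} = \sum_{k=1}^{n} \frac{\partial y_k}{\partial x_i} \frac{\partial}{\partial y_k} = \sum_{k=1}^{n} b_{ki} \frac{\partial}{\partial y_k} $$

and

$$ \frac{\partial^2 u}{\partial x_i\, \partial x_j} = \sum_{k,m=1}^{n} b_{ki}\, b_{mj}\, \frac{\partial^2 u}{\partial y_k\, \partial y_m}. \tag{17.17} $$

Hence, the first term of (17.15) is converted to

$$ \sum_{i,j=1}^{n} a_{ij}\,\partial_i \partial_j u = \sum_{k,m=1}^{n} \Big( \sum_{i,j=1}^{n} b_{ki}\, a_{ij}\, b_{mj} \Big) \frac{\partial^2 u}{\partial y_k\, \partial y_m}, $$

which leads to the relation

$$ a'_{km} = \sum_{i,j=1}^{n} b_{ki}\, a_{ij}\, b_{mj} \tag{17.18} $$

for the coefficients $a'_{km}$ of the second-order terms in the new variables.
Thus we obtain the PDE in the new variables $y_1, y_2, \cdots, y_n$ by the transformation
$A \to B A B^t$ of the coefficient matrix $A = (a_{ij})$, where $B^t$ is the transpose of B.

An appropriate choice of the matrix B makes it possible to diagonalize
A such that

$$ B A B^t = \begin{pmatrix} c_1 & & & 0 \\ & c_2 & & \\ & & \ddots & \\ 0 & & & c_n \end{pmatrix}, $$

where $c_1, c_2, \cdots, c_n$ are the real eigenvalues of the matrix A. Thus, any PDE
of the form (17.15) can be converted, by a linear transformation of the set of
independent variables, into a PDE with diagonal second-order coefficients such as

$$ \sum_{i=1}^{n} c_i \frac{\partial^2 u}{\partial y_i^2} + \sum_{i=1}^{n} d_i \frac{\partial u}{\partial y_i} + a_0 u = 0, \tag{17.19} $$

where the $d_i$ are the transformed coefficients of the first-order terms.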


Theorem:

By linear transformation of independent variables, the equation (17.15) 
can be reduced to the canonical form (17.19). 
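
For two independent variables the diagonalizing matrix can be written down explicitly. The sketch below is an illustration under stated assumptions rather than the book's construction: it takes B to be a plane rotation, uses the convention y = Bx of (17.16) so that the transformed coefficient matrix has entries $\sum_{i,j} b_{ki} a_{ij} b_{mj}$ (written here as B A B^T), and the function names and the sample matrix are hypothetical.

```python
import math

def rotation_diagonalizing(a11, a12, a22):
    """Return the 2x2 rotation B (nested lists) for which B A B^T is diagonal."""
    # The off-diagonal entry of B A B^T is a12*cos(2*theta) - (a11 - a22)*sin(2*theta)/2,
    # which vanishes for the angle chosen here.
    theta = 0.5 * math.atan2(2 * a12, a11 - a22)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def congruence(B, A):
    """Compute B A B^T for 2x2 matrices stored as nested lists."""
    BA = [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(BA[i][k] * B[j][k] for k in range(2)) for j in range(2)] for i in range(2)]

# Sample symmetric coefficient matrix (an assumed example) with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
B = rotation_diagonalizing(A[0][0], A[0][1], A[1][1])
C = congruence(B, A)
```

For this sample matrix the rotation angle is 45 degrees, and the diagonal of C carries the two eigenvalues, which play the role of $c_1$ and $c_2$ in the canonical form.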


17.1.6 Classification of Second-Order PDEs 

We can classify the types of PDEs according to the signs
of the coefficients $c_1, c_2, \cdots, c_n$ in (17.19) for the case $d_i = 0$.

1. Elliptic case:

If all the eigenvalues $c_1, c_2, \cdots, c_n$ are positive, or all are negative, the PDE is
called elliptic. A simple example is given by

$$ \frac{\partial^2 u}{\partial y_1^2} + \frac{\partial^2 u}{\partial y_2^2} + \cdots = 0. $$

2. Hyperbolic case:

In this case none of the $c_i$ (i = 1, 2, ..., n) vanish and one of them has
the opposite sign from the n - 1 others. For example,

$$ \frac{\partial^2 u}{\partial y_1^2} - \frac{\partial^2 u}{\partial y_2^2} - \cdots = 0. $$

3. Parabolic case:

If one of the $c_i$ (i = 1, 2, ..., n) is zero and all the others have the same
sign, the PDE is parabolic.

Below are basic PDEs in physics classified by the definition given above:

$$ \text{Laplace equation:} \quad \Delta_n u = 0, \tag{17.20} $$

$$ \text{Wave equation:} \quad u_{tt} = \Delta_n u. \tag{17.21} $$

Here $\Delta_n$ means the Laplacian defined by $\Delta_n = \partial_1^2 + \partial_2^2 + \cdots + \partial_n^2$. The other
important equation takes the form

$$ u_t = \Delta_n u, \tag{17.22} $$

which we call the diffusion equation. The diffusion equation differs
from the wave equation in that the time-reversal symmetry $t \to -t$ holds for the latter but not for the former. All
of the equations (17.20)-(17.22) are linear since they are of first degree in the
dependent variable u.
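
The three cases can be summarized as a small decision procedure on the signs of the $c_i$. This is only a sketch of the classification rule stated above; the function name, the tolerance, and the label "unclassified" for sign patterns the text does not cover are assumptions made for illustration.

```python
def classify(coefficients, tol=1e-12):
    """Classify sum_i c_i * d^2u/dy_i^2 = 0 from the signs of the c_i."""
    n = len(coefficients)
    pos = sum(1 for c in coefficients if c > tol)
    neg = sum(1 for c in coefficients if c < -tol)
    zero = n - pos - neg
    if zero == 0 and (pos == n or neg == n):
        return "elliptic"        # all coefficients share one sign
    if zero == 0 and (pos == 1 or neg == 1):
        return "hyperbolic"      # exactly one sign differs from the other n - 1
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return "parabolic"       # one vanishing coefficient, the rest of one sign
    return "unclassified"        # pattern not covered by the three cases above

laplace = classify([1.0, 1.0, 1.0])            # second-order part of (17.20), n = 3
wave = classify([1.0, -1.0, -1.0, -1.0])       # u_tt - Laplacian, variables (t, x1, x2, x3)
diffusion = classify([0.0, -1.0, -1.0, -1.0])  # second-order part of u_t - Laplacian
```

Note that the diffusion equation contains a first-order term in t, so applying the rule to its second-order part alone goes slightly beyond the case $d_i = 0$ considered in the text.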


Exercises 

1. Find a general solution of the PDE for u = u(x, y) given by

$$ u_x + 2xy^2 u_y = 0. \tag{17.23} $$

Solution: The characteristic equation of (17.23) reads $dy/dx = 2xy^2$,
which has the solution $y = 1/(p - x^2)$. Hence, we have
$p = x^2 + (1/y)$, i.e., the general solution is given by

$$ u(x, y) = f\left( x^2 + \frac{1}{y} \right). $$

In fact, u(x, y) is constant on the characteristic curve $y = 1/(p - x^2)$
whatever value p takes, as proved by

$$ \frac{d}{dx}\, u\!\left( x, \frac{1}{p - x^2} \right) = \frac{\partial u}{\partial x} + \frac{2x}{(p - x^2)^2} \frac{\partial u}{\partial y} = \frac{\partial u}{\partial x} + 2xy^2 \frac{\partial u}{\partial y} = 0, $$

and similarly we have du/dy = 0.

2. Classify second-order PDEs in two independent variables whose general
form is given by

$$ \partial_x^2 u + 2a_{12}\,\partial_x \partial_y u + a_{22}\,\partial_y^2 u = 0, \tag{17.24} $$

where $a_{12}$, $a_{22}$ are real constants.

Solution: By completing the square, we can write (17.24) as

$$ (\partial_x + a_{12}\,\partial_y)^2 u + (a_{22} - a_{12}^2)\,\partial_y^2 u = 0. \tag{17.25} $$

Here, let us introduce the new variables z and w by the linear
transformation $x = z$, $y = a_{12} z + (a_{22} - a_{12}^2)^{1/2} w$. We
then have

$$ \frac{\partial}{\partial z} = \frac{\partial}{\partial x} + a_{12} \frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial w} = (a_{22} - a_{12}^2)^{1/2} \frac{\partial}{\partial y}, $$

so that for the case $a_{22} > a_{12}^2$, (17.25) gives

$$ \frac{\partial^2 u}{\partial z^2} + \frac{\partial^2 u}{\partial w^2} = 0. $$

This is the elliptic case and is called the Laplace equation in the
zw-plane. We easily see that for (17.25) the hyperbolic case is obtained
for $a_{22} < a_{12}^2$. Thus, the sign of the second term of (17.25) determines
the type of the PDE.


17.2 The Laplacian Operator 

17.2.1 Maximum and Minimum Theorem 

We describe the fundamental properties of three operators: the Laplace, diffusion,
and wave operators. These three operators are of great importance in
the theory of PDEs. We begin with a description of the Laplace operator
(or simply Laplacian) $\Delta_n$ on $\mathbb{R}^n$ defined by

$$ \Delta_n = \sum_{i=1}^{n} \partial_i^2, $$

where n is a positive integer. The Laplacian is not only important in its
own right, but also forms the spatial component of the diffusion operator
$L_D = \partial_t - \Delta_n$ and the wave operator $L_W = \partial_t^2 - \Delta_n$, whose properties are
discussed in Sects. 17.3 and 17.4.

First, we explain the maximum principle for the Laplace equation
given by

$$ \Delta_n u(x) = 0, $$

whose solutions are called harmonic functions. Obviously, the one-dimensional
case (n = 1) is trivial, so we consider the case where n > 1. Let
D be a connected open set and u be a harmonic function in D with
$\sup u(x) = A < \infty$ for $x \in D$.

Maximum and minimum theorem:

The maximum and minimum values of u are achieved on $\partial D$, i.e., the
boundary of D, and nowhere inside.

Before going to the proof, we examine certain properties of the solutions of
Poisson's equation expressed by

$$ \Delta_n u(x) = -4\pi\rho(x). \tag{17.26} $$


Lemma:

If the function $\rho(x)$ in Poisson's equation (17.26) is positive (or negative)
at a point $x_0$, then the solution of (17.26) cannot attain its minimum (or
maximum) value at the point $x_0$.


Proof (of the lemma): If the function u(x) satisfying (17.26) attains
a minimum at a point $x_0$, then it should attain a minimum with
respect to each component $x_1, x_2, \cdots, x_n$ separately at that point.
Then all the second-order derivatives $\partial_i^2 u$ would have to be nonnegative there,
which means that the left-hand side of (17.26), i.e., the
sum of the second-order derivatives, would have to be nonnegative.
This contradicts our hypothesis that $\rho(x_0)$ is positive, since the
right-hand side of (17.26) is then negative.
Hence, the first part of the lemma has been proved. The second part
of the lemma is proved in a similar manner by assuming that $\rho(x_0)$ is
negative and that u attains a maximum at $x_0$.

We are now ready to verify the maximum and minimum theorem. 

Proof (of the maximum and minimum theorem): The proof is
by contradiction. We first assume that, at some interior point $x_0$ of D and for some $\varepsilon > 0$,

$$ u(x_0) > u_b + \varepsilon, $$

where $u_b$ is the value of the function u(x) at an arbitrary point on the
boundary of the domain D. We further introduce the function

$$ v(x) = u(x) + \eta\, r(x)^2, $$

where

$$ r(x)^2 = |x - x_0|^2 $$

and $\eta$ is some positive constant. It then follows that

$$ \Delta_n v = \Delta_n u + 2n\eta = 2n\eta, $$

which says that v(x) is a solution of Poisson's equation (17.26)
with negative $\rho(x)$. Note that $v(x_0) = u(x_0)$ and by hypothesis

$$ u(x_0) > u_b + \varepsilon = v_b + \varepsilon - \eta r^2. $$

Choosing $\eta$ to be so small that throughout D

$$ \varepsilon - \eta r^2 > \frac{\varepsilon}{2}, $$

we obtain

$$ v(x_0) > v_b + \frac{\varepsilon}{2}, $$

which implies that v attains its maximum somewhere within the domain D.
This clearly contradicts the lemma above, so our assumption
at the beginning of the proof was false. The statement for the minimum
follows by applying the same argument to -u.
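
A discrete analogue of the theorem is easy to observe. The sketch below is an illustration only: it replaces the Laplace equation on the unit square by the five-point averaging rule on a grid and relaxes it by Gauss-Seidel sweeps; the grid size, the number of sweeps, and the boundary data sin(3x) + y^2 are arbitrary assumptions. Because each interior value is an average of its four neighbours, no interior value can exceed the largest boundary value or fall below the smallest one.

```python
import math

def solve_laplace(n, boundary, sweeps=500):
    """Gauss-Seidel relaxation of the five-point Laplace equation on an n x n grid."""
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                # Boundary nodes keep the prescribed (Dirichlet) values.
                u[i][j] = boundary(i / (n - 1), j / (n - 1))
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Discrete harmonicity: each interior value is the mean of its neighbours.
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return u

n = 21
u = solve_laplace(n, lambda x, y: math.sin(3 * x) + y * y)
edge = [u[i][j] for i in range(n) for j in range(n) if i in (0, n - 1) or j in (0, n - 1)]
inner = [u[i][j] for i in range(1, n - 1) for j in range(1, n - 1)]
```

Comparing max(inner) with max(edge), and min(inner) with min(edge), shows the extreme values sitting on the boundary, as the theorem asserts for the continuous problem.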


17.2.2 Uniqueness Theorem 

The following theorem establishes the uniqueness of the solution of the 
Dirichlet problems for the Laplace equations. 

Uniqueness theorem:

If it exists, the solution of the Dirichlet problem for a Laplace equation
is unique.


Proof Suppose that $u_1$ and $u_2$ are solutions on D of the Dirichlet problem

$$ \Delta_n u = f(x) \ \text{in } D, \qquad u = g(x) \ \text{on } \partial D. $$

Let $w = u_1 - u_2$; then $\Delta_n w = 0$ in D and w = 0 on $\partial D$. By the maximum
(or minimum) principle, the point $x_M$ (or $x_m$) that maximizes (or minimizes)
w(x) should be located on the boundary of D. Hence, we have

$$ 0 = w(x_m) \le w(x) \le w(x_M) = 0 $$

for all $x \in D$, which means that w = 0 and $u_1 = u_2$.

17.2.3 Symmetric Properties of the Laplacian 

The Laplacian is invariant under all rigid transformations such as translations 
and rotations. A translation from x to a new variable x' is given by 

$$ x' = x + a, $$

where a is a constant vector in n-dimensional space. A rotation is expressed
by

$$ x' = Bx, \tag{17.27} $$

where B is an orthogonal matrix with the property $B B^t = B^t B = I$.
Invariance under translations or rotations means simply that

$$ \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2} = \sum_{j=1}^{n} \frac{\partial^2}{\partial x_j'^2}. $$

The proof for translational invariance is simple, so we leave it to the reader. 
In physical systems, translational invariance is apparent because the physical 
laws are independent of the choice of coordinates. 

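Rotational invariance under (17.27) is proved by using the chain rule,
$\partial / \partial x_k = \sum_i b_{ik}\, \partial / \partial x'_i$, such that

$$ \sum_{k=1}^{n} \frac{\partial^2}{\partial x_k^2} = \sum_{k=1}^{n} \sum_{i,j=1}^{n} b_{ik} b_{jk} \frac{\partial^2}{\partial x'_i\, \partial x'_j} = \sum_{i,j=1}^{n} \delta_{ij} \frac{\partial^2}{\partial x'_i\, \partial x'_j} = \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i'^2}, $$

where we have used the relation

$$ \sum_{k=1}^{n} b_{ik} b_{jk} = (B B^t)_{ij} = \delta_{ij}. $$

Thus the proof has been completed.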

Rotational invariance suggests that a two- or three-dimensional Laplacian 
should take a particularly simple form in polar or spherical coordinates. 


Exercises 


1. Find the harmonic function for a two-dimensional Laplace equation that
is invariant under rotations.


Solution: The two-dimensional Laplacian in polar coordinates is
given by

$$ \Delta_2 = \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2}, $$

where we seek solutions u(r) depending only on r. Then we
take the radial part of the Laplace equation, which gives
$u_{rr} + \frac{1}{r} u_r = 0$ (r > 0). This is an ODE whose solution is given by
$u(r) = a \log r + b$ (r > 0), where a, b are constants. Note that the
function log r is scale invariant, up to an additive constant, under the dilatation
transformation $r \to cr$ for a positive constant c.


2. Find the harmonic function in three dimensions that is invariant under
rotations.

Solution: The Laplacian in spherical coordinates takes the form

$$ \Delta_3 = \frac{\partial^2}{\partial r^2} + \frac{2}{r} \frac{\partial}{\partial r} + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta\, \frac{\partial}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2}{\partial \phi^2} \qquad (r > 0). $$

Since the solution depends only on r, we have the Laplace equation
given by $u_{rr} + \frac{2}{r} u_r = 0$ (r > 0). So we have $(r^2 u_r)_r = 0$ and the
solution becomes $u = \frac{a}{r} + b$ (r > 0), where a, b are constants.
This is an important harmonic function that is not finite at the
origin.


3. Show that, for an arbitrary integer n > 2, the general form of solutions
with rotational symmetry is given by

$$ u(r) = a r^{2-n} + b \qquad (n > 2,\ r > 0), \tag{17.28} $$

where a, b are constants.

Solution: This is shown by applying the chain rule
such that

$$ \Delta_n u(r) = \sum_{i=1}^{n} \partial_i \left[ \frac{x_i}{r} u'(r) \right] = \sum_{i=1}^{n} \left[ \frac{x_i^2}{r^2} u''(r) + \frac{1}{r} u'(r) - \frac{x_i^2}{r^3} u'(r) \right] = u''(r) + \frac{n-1}{r} u'(r), \tag{17.29} $$

where the relation $\partial r / \partial x_i = x_i / r$ is used. If $\Delta_n u = 0$, (17.29)
yields

$$ \frac{u''(r)}{u'(r)} = \frac{1-n}{r}. $$

Integrating twice, we have (17.28).
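
The radial solutions of Exercises 1 to 3 can be checked without polar coordinates by summing second differences along the Cartesian axes. The following sketch is an illustrative assumption, not the book's method: the helper names, the step size h, and the sample points are hypothetical, and the finite-difference Laplacian only approximates $\Delta_n$ to order h^2.

```python
import math

def radial_solution(n, a=1.0, b=0.0):
    """u(r) = a*log(r) + b for n = 2, and u(r) = a*r**(2 - n) + b as in (17.28) for n > 2."""
    if n == 2:
        return lambda r: a * math.log(r) + b
    return lambda r: a * r ** (2 - n) + b

def laplacian(u_of_r, point, h=1e-3):
    """Sum of central second differences of u(|x|) along each coordinate axis."""
    def u(x):
        return u_of_r(math.sqrt(sum(c * c for c in x)))
    total = 0.0
    for i in range(len(point)):
        plus = list(point)
        plus[i] += h
        minus = list(point)
        minus[i] -= h
        total += (u(plus) - 2 * u(point) + u(minus)) / (h * h)
    return total

sample = [0.9, -0.4, 0.7, 0.3, -0.5]   # arbitrary point away from the origin
residuals = {n: laplacian(radial_solution(n), sample[:n]) for n in (2, 3, 4, 5)}
contrast = laplacian(lambda r: r * r, sample[:3])   # r^2 is not harmonic: Laplacian = 2n = 6
```

The residuals are zero up to discretization error in every dimension tried, whereas the contrast case returns a value close to 6, so the test does distinguish harmonic from non-harmonic radial functions.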


17.3 The Diffusion Operator 

17.3.1 The Diffusion Equations in Bounded Domains 

The diffusion equation describes physical phenomena such as Brownian 
motion of a particle or heat flow, whose general form is written as 

$$ L_D\, u(x_1, x_2, \cdots, t) = 0, \tag{17.30} $$

where $L_D$ is the diffusion operator defined by

$$ L_D = \partial_t - \sum_{i=1}^{n} \partial_i^2. \tag{17.31} $$

If the scale transformation $t \to Dt$ is used, we have the diffusion equation

$$ \partial_t u - D \Delta u = 0, $$

where D is the diffusion constant. For heat flow, u represents the temperature
at position $x = (x_1, x_2, \cdots)$ and time t, and for Brownian motion u is the
probability density of finding a particle at x and t. Hereafter, we treat the system with
unit diffusion constant D = 1. If we have to go back to the actual diffusion
equation, we apply the transformation $t \to Dt$ to the final solution.

17.3.2 Maximum and Minimum Theorem 

We begin by describing the maximum principle for the diffusion equation 
defined in a bounded domain, from which we deduce the uniqueness of initial 
and boundary value problems. 

Maximum and minimum theorem:

Let D be a bounded domain in $\mathbb{R}^n$ and $0 \le t \le T < \infty$. If u is a real-valued
continuous solution of the diffusion equation (17.30), it takes its maximum either at the initial time
(t = 0) or on the boundary $\partial D$.


Proof For any $\varepsilon > 0$, we set

$$ v(x, t) = u(x, t) + \varepsilon |x|^2, $$

for which we have

$$ v_t - \Delta_n v = -2n\varepsilon < 0. \tag{17.32} $$

If the maximum of v occurred at an interior point $(x_0, t_0)$ of the domain
$D \times (0, T)$, the first derivatives $v_t, v_{x_1}, v_{x_2}, \cdots$ of v would vanish there
and the second derivatives would satisfy $\Delta_n v \le 0$, so that $v_t - \Delta_n v \ge 0$. This contradicts (17.32), so there is
no interior maximum. Suppose now that the maximum occurs at t = T at a point $x_0$ in D;
the time derivative $v_t$ must be nonnegative there because

$$ v(x_0, T) \ge v(x_0, T - \delta) $$

for small $\delta > 0$, and

$$ \Delta_n v \le 0, $$

which again contradicts (17.32). Therefore, the maximum of v must be at the
initial time t = 0, namely, on $D \times \{0\}$, or on the boundary $\partial D$. Since
$u \le v \le u + \varepsilon \max_D |x|^2$ and $\varepsilon$ is arbitrary, the same holds for u. Replacing u by -u,
we see that the minimum is also achieved on either $D \times \{0\}$ or $\partial D$.
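
The same behaviour can be seen in the simplest numerical scheme for $u_t = u_{xx}$. The sketch below is an assumption-laden illustration: it uses the explicit forward-time, centred-space update on a one-dimensional grid with zero boundary values, with hypothetical initial data and with the ratio r = dt/dx^2 = 0.4 chosen below 1/2 so that every update is a weighted average of old values.

```python
import math

def heat_step(u, r):
    """One explicit step of u_t = u_xx with r = dt/dx**2; the end values stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        # For r <= 1/2 this is a convex combination of u[i-1], u[i], u[i+1].
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

m = 50
u = [math.sin(math.pi * i / m) + 0.3 * math.sin(5 * math.pi * i / m) for i in range(m + 1)]
u[0] = u[-1] = 0.0                     # Dirichlet boundary values
initial_max, initial_min = max(u), min(u)
history_max, history_min = [], []
for _ in range(400):
    u = heat_step(u, 0.4)
    history_max.append(max(u))
    history_min.append(min(u))
```

At no later step does the profile rise above its initial maximum or fall below the smaller of its initial minimum and the boundary value, which is the discrete counterpart of the theorem.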


17.3.3 Uniqueness Theorem 

The maximum principle can be used to prove uniqueness for the Dirichlet
problem for the diffusion equation. The conditions are given by

$$ L_D u = f(x, t) \ \text{in } D \times (0, \infty), \qquad u(x, 0) = g(x), \qquad u(x, t) = h(t) \ \text{on } \partial D $$

for given functions f, g, and h.

The following is an immediate corollary of the maximum and minimum 
theorem. 


Uniqueness theorem:

There is at most one solution of the Dirichlet problem for the diffusion
equation.


Proof Let u(x, t) and v(x, t) be two solutions of the Dirichlet problem above and w = u - v be their
difference. Then we have $L_D w = 0$, w(x, 0) = 0, and w(x, t) = 0
on $\partial D$. By the maximum principle, w(x, t) has its maximum at the initial
time or on the boundary, exactly where w vanishes. Thus $w(x, t) \le 0$. The same
reasoning for the minimum shows that $w(x, t) \ge 0$. Therefore w(x, t) = 0, so
that u = v for all $t \ge 0$.


17.4 The Wave Operator 

17.4.1 The Cauchy Problem 

The wave operator (or d'Alembertian) on $\mathbb{R}^n \times \mathbb{R}$ is expressed by

$$ L = \partial_t^2 - \Delta_n = \partial_t^2 - \sum_{i=1}^{n} \partial_i^2, \tag{17.33} $$

from which we have the wave equation in the general form

$$ Lu = \partial_t^2 u - \Delta_n u = 0. \tag{17.34} $$

The wave equation is the prototype of the hyperbolic PDEs and describes
waves with unit velocity of propagation in homogeneous isotropic media. By
making the transformation $t \to ct$, we have the standard form of the wave
equation

$$ \partial_t^2 u - c^2 \Delta_n u = 0, \tag{17.35} $$

where c is the wave velocity. The solution for (17.35) is obtained by trans- 
forming the time variable t into ct in the result of (17.34). 

The initial value problem for the wave equation is called the Cauchy
problem and is given by the inhomogeneous wave equation

$$ \partial_t^2 u(x, t) - \Delta_n u(x, t) = f(x, t) \tag{17.36} $$

under the two initial conditions

$$ u(x, 0) = \phi(x), \qquad \partial_t u(x, 0) = \psi(x), $$

where f, $\phi$, $\psi$ are given continuous and differentiable functions. For example,
f(x, t) represents an external force acting on the system described by (17.36).

The wave operator (17.33) is a linear operator, so the solution is the sum
of the general solution of the homogeneous equation (17.34) and a particular
solution of the inhomogeneous equation (17.36).


17.4.2 Homogeneous Wave Equations 


First, we provide the solution for the one-dimensional homogeneous version
(f = 0) of the Cauchy problem (17.36), in which the spatial part is defined on
the whole line $-\infty < x < \infty$. For example, consider the
case of an infinitely long vibrating string. The wave equation is written as

$$ u_{tt} - u_{xx} = 0, \tag{17.37} $$

which is a hyperbolic second-order PDE that we can express by

$$ \left( \frac{\partial}{\partial t} - \frac{\partial}{\partial x} \right) \left( \frac{\partial}{\partial t} + \frac{\partial}{\partial x} \right) u = 0. \tag{17.38} $$

Let us set

$$ u_t + u_x = v; \tag{17.39} $$

then the first-order PDE for v(x, t) is obtained from (17.38) as

$$ v_t - v_x = 0. \tag{17.40} $$

As shown earlier, (17.40) has a solution of the form

$$ v(x, t) = g(x + t), \tag{17.41} $$

where g is any function. Thus we must solve (17.39) for u, which is given by

$$ u_t + u_x = g(x + t). \tag{17.42} $$

One solution of (17.42) takes the following form:

$$ u(x, t) = h(x + t), \tag{17.43} $$

which we can check through direct differentiation of (17.43) by setting p = x + t
such that

$$ \frac{\partial u}{\partial x} = \frac{dh}{dp} \frac{\partial p}{\partial x} = h'(p), \qquad \frac{\partial u}{\partial t} = \frac{dh}{dp} \frac{\partial p}{\partial t} = h'(p). $$

Substituting these into (17.42) gives 2h'(p) = g(p), so that

$$ h(p) = \frac{1}{2} \int g(p)\, dp. \tag{17.44} $$

Another possibility is the general solution of the homogeneous equation
obtained by setting g = 0 in (17.42), which takes the form

$$ u = z(x - t). \tag{17.45} $$

Adding this to the particular solution (17.43), with h given by (17.44), we have the general expression of a solution,

$$ u(x, t) = h(x + t) + z(x - t). \tag{17.46} $$


Now let us determine h and z in (17.46) from the initial conditions

$$ u(x, 0) = \phi(x), \qquad u_t(x, 0) = \psi(x), $$

where $\phi$ and $\psi$ are given functions of x. From (17.46), we have the relations

$$ \phi(x) = h(x) + z(x), \tag{17.47} $$

$$ \psi(x) = h'(x) - z'(x). \tag{17.48} $$

By differentiating (17.47), we obtain $\phi' = h' + z'$. Combining this with (17.48),
we have

$$ h' = \frac{1}{2} (\phi' + \psi), \qquad z' = \frac{1}{2} (\phi' - \psi). $$

Integrating with respect to p yields

$$ h(p) = \frac{1}{2} \phi(p) + \frac{1}{2} \int_0^p \psi(p')\, dp' + a, $$

$$ z(p) = \frac{1}{2} \phi(p) - \frac{1}{2} \int_0^p \psi(p')\, dp' - a, $$

where a is a constant. So, we get

$$ u(x, t) = \frac{1}{2} [\phi(x + t) + \phi(x - t)] + \frac{1}{2} \int_{x-t}^{x+t} \psi(p)\, dp, \tag{17.49} $$

which is the solution of the initial value problem for the homogeneous equation
(17.37).
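
Formula (17.49) is straightforward to evaluate once the integral of the initial velocity is approximated. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the integral is computed with a composite Simpson rule, and the initial data sin and cos are chosen only because the exact solution is then sin(x + t).

```python
import math

def integrate(func, lo, hi, panels=200):
    """Composite Simpson rule on [lo, hi]; panels must be even."""
    h = (hi - lo) / panels
    total = func(lo) + func(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def dalembert(phi, psi, x, t):
    """Evaluate (17.49) for initial displacement phi and initial velocity psi."""
    return 0.5 * (phi(x + t) + phi(x - t)) + 0.5 * integrate(psi, x - t, x + t)

# With phi = sin and psi = cos, (17.49) reduces to u(x, t) = sin(x + t).
value = dalembert(math.sin, math.cos, 0.3, 1.1)
exact = math.sin(0.3 + 1.1)
```

The two numbers agree to within the quadrature error, which gives a quick consistency check on the signs and limits in (17.49).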
17.4.3 Inhomogeneous Wave Equations

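Next we solve the initial value problem for the inhomogeneous PDE ($f \neq 0$)
by applying the method of characteristic coordinates. We transform the
variables x, t into new variables $\xi = x + t$, $\eta = x - t$. In the new variables,
the wave equation $u_{tt} - u_{xx} = f$ becomes

$$ \frac{\partial^2 u}{\partial \eta\, \partial \xi} = -\frac{1}{4}\, f\!\left( \frac{\xi + \eta}{2}, \frac{\xi - \eta}{2} \right). $$

This equation can be integrated with respect to $\eta$, leaving $\xi$ as a constant.
Thus we have

$$ \frac{\partial u}{\partial \xi} = -\frac{1}{4} \int^{\eta} f\, d\eta, \tag{17.50} $$

where the lower limit of integration is arbitrary. Again we can integrate with
respect to $\xi$:

$$ u(\xi, \eta) = -\frac{1}{4} \int^{\xi}\! \int^{\eta} f\, d\eta\, d\xi. \tag{17.51} $$

Here we consider the dependent variable u at a fixed point $(\xi_0, \eta_0)$ defined
by

$$ \xi_0 = x_0 + t_0, \qquad \eta_0 = x_0 - t_0. \tag{17.52} $$

We can evaluate (17.51) at the point $(\xi_0, \eta_0)$ and make a particular choice of
the lower limits, corresponding to vanishing initial data on the line t = 0 (i.e., $\eta = \xi$), such that

$$ u(\xi_0, \eta_0) = \frac{1}{4} \int_{\eta_0}^{\xi_0}\! \int_{\eta_0}^{\xi} f\, d\eta\, d\xi. $$

Here we change the variables $\xi$, $\eta$ back into the original ones (x, t); the Jacobian
is the absolute value of the determinant of the coefficient matrix:

$$ J = \left| \det \begin{pmatrix} \partial \xi / \partial x & \partial \xi / \partial t \\ \partial \eta / \partial x & \partial \eta / \partial t \end{pmatrix} \right| = 2. $$

Thus $d\xi\, d\eta = J\, dx\, dt = 2\, dx\, dt$, so the double integral can be transformed as

$$ u(x_0, t_0) = \frac{1}{2} \int_0^{t_0}\! \int_{x_0 - (t_0 - t)}^{x_0 + (t_0 - t)} f(x, t)\, dx\, dt. $$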

As a result, we have the following theorem:

Theorem:

The unique solution of (17.36) in one spatial dimension is given by

$$ u(x, t) = \frac{1}{2} [\phi(x + t) + \phi(x - t)] + \frac{1}{2} \int_{x-t}^{x+t} \psi(p)\, dp + \frac{1}{2} \int_0^t\! \int_{x-(t-t')}^{x+(t-t')} f(x', t')\, dx'\, dt', $$

where $\phi(x) = u(x, 0)$ and $\psi(x) = u_t(x, 0)$.
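
The source term in the theorem is an integral of f over the triangle of dependence below the point (x, t). As a rough sketch, assuming zero initial data and using a plain midpoint rule (the function name, the number of panels, and the two test forcings are illustrative choices, not taken from the text), it can be evaluated as follows. For f = 1 the exact contribution is t^2/2, and for f = x' it is x t^2/2; both satisfy $u_{tt} - u_{xx} = f$ with vanishing initial data.

```python
def source_term(f, x, t, panels=60):
    """Midpoint-rule estimate of (1/2) * double integral of f over the dependence triangle."""
    dt = t / panels
    total = 0.0
    for i in range(panels):
        tp = (i + 0.5) * dt           # t' at the midpoint of the time slice
        half_width = t - tp           # x' runs over [x - (t - t'), x + (t - t')]
        dx = 2 * half_width / panels
        for j in range(panels):
            xp = x - half_width + (j + 0.5) * dx
            total += f(xp, tp) * dx * dt
    return 0.5 * total

constant_forcing = source_term(lambda xp, tp: 1.0, 0.2, 1.5)   # exact value: t**2 / 2
linear_forcing = source_term(lambda xp, tp: xp, 0.2, 1.5)      # exact value: x * t**2 / 2
```

The midpoint rule happens to be exact for these two forcings, so they serve as a check on the limits of integration rather than on the accuracy of the quadrature.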


17.4.4 Wave Equations in Finite Domains 

In this section we attempt to solve the wave equations defined in the region
$D \times (0, \infty)$, where D is a bounded domain of $\mathbb{R}^n$. For this problem, we have
to specify the initial conditions at t = 0 as well as some boundary conditions
on $\partial D$. As noted in Sect. 17.1.2, the commonly used boundary conditions are
the Dirichlet and Neumann conditions. First we treat a homogeneous wave
equation with no external term given by

$$ \partial_t^2 u - \Delta_n u = 0, \tag{17.53} $$

where the initial conditions are

$$ u(x, 0) = f(x), \qquad \partial_t u(x, 0) = g(x), \tag{17.54} $$

and the boundary conditions on $\partial D$ are given by

$$ u(x, t) = 0 \quad \text{or} \quad \partial_n u(x, t) = 0. \tag{17.55} $$

When the boundary conditions are independent of t, as they are here, the method of
separation of variables is useful, i.e., we assume that the solution u takes
the form

$$ u(x, t) = X(x) T(t), \tag{17.56} $$

where X satisfies the boundary conditions (17.55) on $\partial D$. Substituting (17.56)
into (17.53), we have

$$ \frac{\Delta X(x)}{X(x)} = \frac{T''(t)}{T(t)} = -\mu^2. \tag{17.57} $$

This defines a quantity $\mu^2$ that must be constant since $\Delta X / X$ depends only
on x and $T'' / T$ depends only on t. The reason for the positive constant $\mu^2 > 0$
will be seen later.

Equation (17.57) gives a pair of separate differential equations for X(x)
and T(t): one equation is

$$ \Delta X(x) = -\mu^2 X(x), \tag{17.58} $$

to be solved under the given boundary conditions (17.55), and the other is

$$ T''(t) = -\mu^2 T(t), \tag{17.59} $$

in which $0 < t < \infty$. The solution of this ODE is obtained as

$$ T(t) = a \cos \mu t + b \sin \mu t, \tag{17.60} $$

where a and b are constants that can be determined from the initial conditions.
Combining (17.60) with X(x), the solution is expressed as

$$ u(x, t) = X_\mu(x) (a \cos \mu t + b \sin \mu t). \tag{17.61} $$

This is a normal mode of vibration with eigenfrequency $\mu$, and the general
solution is obtained by the superposition of normal modes. Thus, we have
the general solution of the form

$$ u(x, t) = \sum_n X_n(x) (a_n \cos \mu_n t + b_n \sin \mu_n t). \tag{17.62} $$

For example, from the initial conditions in (17.54), we have

$$ \sum_n a_n X_n = f, \qquad \sum_n \mu_n b_n X_n = g, $$

so, for orthonormal eigenfunctions $X_n$, the coefficients in (17.62) are given by

$$ a_n = (f \,|\, X_n), \qquad b_n = (g \,|\, X_n) / \mu_n. $$

Exercises 


1. Find the general solution for the wave equation defined on the one-dimensional
bounded domain $(0, l) \times (0, \infty)$, which is given by

$$ \partial_t^2 u - \Delta u = 0, $$

under the conditions u(x, 0) = f(x), $\partial_t u(x, 0) = g(x)$, and u(0, t) =
u(l, t) = 0.


Solution: The normalized eigenfunctions are $X_n = \sqrt{2/l}\, \sin(n\pi x / l)$,
and the associated eigenfrequencies $\mu_n$ are the integer multiples
of the fundamental frequency $\pi / l$. Thus, we obtain

$$ u(x, t) = \sum_{n=1}^{\infty} \left( a_n \cos \frac{n\pi t}{l} + b_n \sin \frac{n\pi t}{l} \right) \sin \frac{n\pi x}{l}, $$

where the coefficients are

$$ a_n = \frac{2}{l} \int_0^l f(x) \sin \frac{n\pi x}{l}\, dx, \qquad b_n = \frac{2}{n\pi} \int_0^l g(x) \sin \frac{n\pi x}{l}\, dx. $$
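
The series solution of Exercise 1 can be assembled directly from its coefficient formulas. The sketch below is illustrative and rests on assumptions: the function names, the truncation at 20 modes, the Simpson quadrature, the length l = 2, and the initial data are all hypothetical choices. With f = sin(πx/l) and g = sin(2πx/l) only the first two modes are excited, so the result can be compared with a closed form.

```python
import math

def sine_coefficient(func, n, length, panels=400):
    """Simpson estimate of the integral of func(x) * sin(n*pi*x/length) over [0, length]."""
    h = length / panels
    g = lambda x: func(x) * math.sin(n * math.pi * x / length)
    total = g(0.0) + g(length)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def string_solution(f, g, length, modes=20):
    """Return u(x, t) built from the first `modes` normal modes of the fixed-end string."""
    a = [2 / length * sine_coefficient(f, n, length) for n in range(1, modes + 1)]
    b = [2 / (n * math.pi) * sine_coefficient(g, n, length) for n in range(1, modes + 1)]
    def u(x, t):
        total = 0.0
        for n in range(1, modes + 1):
            w = n * math.pi / length   # eigenfrequency of the n-th mode
            total += (a[n - 1] * math.cos(w * t) + b[n - 1] * math.sin(w * t)) * math.sin(w * x)
        return total
    return u

length = 2.0
u = string_solution(lambda x: math.sin(math.pi * x / length),
                    lambda x: math.sin(2 * math.pi * x / length), length)
# Closed form for these data: a_1 = 1 and b_2 = length / (2*pi), all other coefficients zero.
expected = (math.cos(math.pi * 0.9 / length) * math.sin(math.pi * 0.7 / length)
            + length / (2 * math.pi) * math.sin(2 * math.pi * 0.9 / length)
            * math.sin(2 * math.pi * 0.7 / length))
```

Evaluating u(0.7, 0.9) reproduces the closed form, and u vanishes at both ends of the string for all t, as the boundary conditions require.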
2. The differential equation for a point source at r = 0 at time $t_0 = 0$ in an
infinite medium is given by

$$ \partial_t G - \Delta_3 G = \delta(r) \delta(t). \tag{17.63} $$

Find the solution G(r, t), called Green's function, by means of Fourier
and Laplace transforms.


Solution: If we take a Fourier transform in space and a Laplace
transform in time, (17.63) becomes $G(k, \omega) = 1 / [(2\pi)^3 (\omega + k^2)]$.
The inverse Laplace transform yields $G(k, t) = e^{-k^2 t} / (2\pi)^3$. We
then obtain Green's function G(r, t) by the inverse spatial Fourier transform
given by

$$ G(r, t) = \frac{1}{(2\pi)^3} \int e^{-k^2 t} e^{i k \cdot r}\, d^3 k. $$

Integrating over the angles of k yields

$$ G(r, t) = \frac{2}{(2\pi)^2} \int_0^{\infty} e^{-k^2 t}\, \frac{\sin kr}{kr}\, k^2\, dk = \frac{1}{4\pi^2 r}\, \mathrm{Im} \int_{-\infty}^{\infty} e^{-k^2 t} e^{ikr}\, k\, dk = \frac{1}{4\pi^2 r}\, \mathrm{Im}\, \frac{1}{i} \frac{\partial}{\partial r} \int_{-\infty}^{\infty} e^{-k^2 t} e^{ikr}\, dk, $$

which gives Green's function in the form $G(r, t) = e^{-r^2/4t} / (4\pi t)^{3/2}$ for
t > 0. For t < 0, G(r, t) = 0.


3. Find the half-space one-dimensional Green's function defined on x > 0,
for a source at $x = x_0$, that satisfies the boundary condition G(x, t) = 0 at x = 0.

Solution: Using an image source of negative strength at $x = -x_0$,
the solution is expressed by $G(x, t) = G_0(x - x_0, t) - G_0(x + x_0, t)$,
x > 0, where $G_0(x, t) = e^{-x^2/4t} / (4\pi t)^{1/2}$.
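
The image construction can be checked numerically. In the sketch below the function names, the step size h, and the sample values (source at 1.2, observation at x = 0.8, t = 0.5) are assumptions for illustration; the code verifies that the kernel vanishes at the wall and that $G_t - G_{xx}$ is zero away from the source, up to finite-difference error.

```python
import math

def free_kernel(x, t):
    """One-dimensional free-space diffusion kernel G0(x, t) for t > 0."""
    return math.exp(-x * x / (4 * t)) / math.sqrt(4 * math.pi * t)

def half_space_kernel(x, t, x0):
    """Image-source kernel on x > 0 that vanishes at the wall x = 0."""
    return free_kernel(x - x0, t) - free_kernel(x + x0, t)

def heat_residual(x, t, x0, h=1e-3):
    """Central-difference estimate of G_t - G_xx at (x, t)."""
    G = lambda X, T: half_space_kernel(X, T, x0)
    g_t = (G(x, t + h) - G(x, t - h)) / (2 * h)
    g_xx = (G(x + h, t) - 2 * G(x, t) + G(x - h, t)) / (h * h)
    return g_t - g_xx

wall_value = half_space_kernel(0.0, 0.5, 1.2)   # boundary condition at x = 0
residual = heat_residual(0.8, 0.5, 1.2)         # diffusion equation for t > 0
```

The wall value is zero because the source and its image are equidistant from x = 0, and the residual is small because each of the two terms separately solves the diffusion equation for t > 0.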


4. Consider the wave equation with a source term h(r, t) given by

$$ \partial_t^2 f - \Delta_3 f = h(r, t). $$

Show that the solution is expressed as

$$ f(r, t) = \frac{1}{4\pi} \int \frac{h(r', t - |r - r'|)}{|r - r'|}\, d^3 r'. \tag{17.64} $$

Solution: The Green function G(r, t) is defined as the solution satisfying
the equation $\partial_t^2 G - \Delta_3 G = \delta(r) \delta(t)$. Its spatial Fourier
and temporal Laplace transform is obtained by setting
$\omega \to -\omega^2$ in Exercise 2 as $G(k, \omega) = 1 / [(2\pi)^3 (-\omega^2 + k^2)]$.
The inverse transform gives $G(|r - r'|, t - t') =
\delta(|r - r'| - (t - t')) / (4\pi |r - r'|)$. Since the physical system is invariant
under translations in space and time, the Green function
depends only on the relative space and time coordinates $|r - r'|$ and
$t - t'$. The Green function has the property that the solution can
be written as $f(r, t) = \int G(|r - r'|, t - t')\, h(r', t')\, d^3 r'\, dt'$, so it is given
by (17.64).

6. Find the general solution for the wave equation

$$ \partial_t^2 u - \Delta_3 u = 0, $$

where $\Delta_3$ is the Laplacian defined by

$$ \Delta_3 = \partial_x^2 + \partial_y^2 + \partial_z^2. $$

Solution: Since the system is isotropic and homogeneous, we can
assume the solution, in analogy with the one-dimensional case, to be of the form
$f(p) = f(n \cdot x \pm t)$, where n is the unit vector that points along the
direction of propagation of the wave and $n \cdot x = lx + my + nz$. From
the chain rule, we have $(l^2 + m^2 + n^2 - 1)\, d^2 f / dp^2 = 0$.
Thus, f can be arbitrary since $l^2 + m^2 + n^2 = 1$. The general
solution is given by $u(x, t) = f(n \cdot x - t) + g(n \cdot x + t)$, where n
is any unit vector.


17.5 Applications in Physics and Engineering 

17.5.1 Wave Equations for Vibrating Strings 

In the previous sections, we rigorously studied the theories underlying the 
following three typical classes of partial differential equations: Laplace equa- 
tions, diffusion equations, and wave equations. In this section, we attempt 
to formulate mathematical expressions for these classes on the basis of the 
associated physical phenomena; e.g., we will see that the mathematical form 
of the wave equation 

$$ \partial_t^2 u(x, t) = v^2\, \partial_x^2 u(x, t) \tag{17.65} $$

is derived by considering the wavy motion of a thin string. This will make
clear why equations of the same form as (17.65) are called wave equations.

Fig. 17.1. Schematic of a thin stretched string and a line element $\Delta s$; the tensions
exerted at both ends of the line element have equal magnitudes $\tau$ but act in different
directions


Suppose that a thin string is stretched between two fixed points with a
tension $\boldsymbol{\tau}$ exerted at the two end points. We assume that the string is perfectly
flexible, i.e., only tensile forces can be transmitted in the tangential direction.
Then, as illustrated in Fig. 17.1, the magnitude of the tension exerted in the
tangential direction is the same for every part of the string. It can be seen
from the figure that the vertical components of the tension at the two ends
of a line element with length $\Delta s$ are $-\tau \sin\theta_1$ and $\tau \sin\theta_2$, where $\tau = |\boldsymbol{\tau}|$.
Hence, the vertical component $\tau_u$ of the external force exerted on the line
element is given by

$$ \tau_u = \tau (\sin\theta_2 - \sin\theta_1). \tag{17.66} $$

The sine terms are rewritten in terms of the derivative $\partial_x u$ by using the
following approximations:

$$ \sin\theta_1 = \frac{\partial u}{\partial s} = \partial_x u\, \frac{dx}{ds} = \partial_x u\, \frac{1}{\sqrt{1 + (\partial_x u)^2}} = \partial_x u \left[ 1 - \frac{1}{2} (\partial_x u)^2 + \cdots \right] \simeq \partial_x u, $$

where the relation $ds = \sqrt{(dx)^2 + (du)^2}$ was used, and

$$ \sin\theta_2 = \frac{\partial u(x + \Delta x)}{\partial s} = \frac{\partial u(x + \Delta x)}{\partial x} \frac{dx}{ds} \simeq \frac{\partial u(x + \Delta x)}{\partial x} = \partial_x u + \partial_x^2 u \big|_{x = \xi}\, \Delta x, $$

where $\xi$ is a constant satisfying the condition $x < \xi < x + \Delta x$. (The mean
value theorem ensures the existence of such a constant $\xi$.) Substitution of
the sine terms in (17.66) yields

$$ \tau_u = \tau\, \partial_x^2 u \big|_{x = \xi}\, \Delta x. $$

Since the inertial force exerted by the line element is given by

$$ \rho\, \Delta s\, \partial_t^2 u, $$

where $\rho$ is the line density, we obtain the equation of motion for the string
in the vertical direction:

$$ \rho\, \frac{\Delta s}{\Delta x}\, \partial_t^2 u = \tau\, \partial_x^2 u \big|_{x = \xi}. $$

Taking the limit $\Delta x \to 0$ so that $\xi \to x$, and then approximating ds/dx as 1,
we obtain the final result:

$$ \partial_t^2 u = \frac{\tau}{\rho}\, \partial_x^2 u. \tag{17.67} $$

It is customary to denote the positive constant $\tau / \rho$ as $v^2$, which allows us to
write (17.67) in the familiar form (17.65).

17.5.2 Diffusion Equations for Heat Conduction 

In this subsection, we attempt to derive mathematical expressions for diffu- 
sion equations by considering the physical phenomena that occur during heat 
conduction, i.e., the flow of heat in a certain medium from points at a high 
temperature to those at lower temperatures. This process takes place in such 
a manner that molecules in irregular motion exchange their kinetic energy by 
colliding with each other. 

We aim to determine the amount of heat $\delta Q$ penetrating an arbitrarily
chosen surface element $\delta S$ inside the medium per unit time (called the heat
flux). In order to find $\delta Q$, we consider another surface element $\delta S_1$, which
has the same area as $\delta S$, is parallel to $\delta S$, and is located at an orthogonal
distance $\delta n$ from $\delta S$. We assume that $\delta S$ is so small that the temperatures
u = u(x, y, z, t) on $\delta S$ and $u_1 = u(x + \delta x, y + \delta y, z + \delta z, t)$ on $\delta S_1$ are constant
over $\delta S$ and $\delta S_1$, respectively.

From thermodynamics, we know that the magnitude of the heat flux, the
difference between u and $u_1$, denoted by $\delta u$, and the area of the surface element
are related in the following manner:

$$ \delta Q \cdot \delta n = \kappa \cdot \delta u \cdot \delta S, \tag{17.68} $$

where the value of the constant $\kappa$, called the thermal conductivity, depends
on the medium. Dividing both sides of (17.68) by $\delta n$ and taking the limit
$\delta n \to 0$, we have

$$ \delta Q = \kappa\, \frac{\partial u}{\partial n} \cdot \delta S. $$

Here, $\partial u / \partial n$ is the derivative in the direction normal to $\delta S$ and is expressed
as

$$ \frac{\partial u}{\partial n} = n \cdot \nabla u, $$

where n is the unit vector normal to $\delta S$. Thus, we obtain the flux passing
through a volume element $\delta v$ that is enclosed by a surface S:

$$ Q = \kappa \iint_S \frac{\partial u}{\partial n}\, dS = \kappa \iint_S \nabla u \cdot n\, dS = \kappa \iiint_{\delta v} \nabla \cdot (\nabla u)\, dV. $$

We now apply the mean value theorem to the volume integral over $\delta v$ to
obtain

$$ Q = \kappa \nabla \cdot [\nabla u(x^*, y^*, z^*, t)]\, \delta v, \tag{17.69} $$

where $(x^*, y^*, z^*)$ is a point in $\delta v$.

Apart from the above discussion, we also see that Q is related
to the temperature variation in $\delta v$ with time. In fact, the temperature u in $\delta v$
increases (or decreases) owing to the accumulation (or loss) of heat in $\delta v$ at a
rate of $\partial u / \partial t$. Therefore, the flow of heat into (or out of) $\delta v$ can be written
as

$$ Q = \rho \sigma\, \frac{\partial u}{\partial t}\, \delta v, \tag{17.70} $$

where $\rho$ is the mass density and $\sigma$ (the specific heat) is a characteristic of
the medium. By setting (17.69) equal to (17.70), and allowing the volume
element $\delta v$ to shrink to a point, we have

$$ \rho \sigma\, \partial_t u = \kappa \nabla \cdot (\nabla u) = \kappa \nabla^2 u. $$

Clearly, this result is of the same form as a diffusion equation; it describes
heat conduction in a medium with physical parameters $\rho$, $\sigma$, $\kappa$.



Part VI 


Tensor Analyses
Cartesian Tensors


Abstract Tensors are geometric entities that provide concise mathematical frame- 
works for formulating problems in physics and engineering. The most important 
feature of tensors is their coordinate invariance: tensors are independent of the type 
of coordinate system chosen. This feature is similar to the condition that the length 
and direction of a geometric figure do not change, regardless of the coordinate sys- 
tem used for the algebraic expression. In contrast, the components of a tensor are 
coordinate-dependent in a structured routine. In this chapter, we discuss the ways 
in which the choice of a coordinate system affects the components of a tensor. 


18.1 Rotation of Coordinate Axes 

18.1.1 Tensors and Coordinate Transformations 

A tensor is a natural generalization of a vector or a scalar encountered in
elementary vector calculus. The latter two are, in fact, both special cases of
tensors of order $n$, whose specification in a three-dimensional space requires
$3^n$ numbers, called the components of the tensor. In this context, scalars are
tensors of the zero order with $3^0 = 1$ component and vectors are tensors of
the first order with $3^1 = 3$ components.

Of importance is the fact that a tensor of order $n$ is much more than
just a set of $3^n$ numbers. The key property of a tensor is the adherence of
its components to a transformation law under a change of coordinate system,
say, from a rectangular to an elliptic, polar, or other curvilinear coordinate
system. If the coordinate system is changed to a new one, the components of
a tensor change according to a characteristic transformation law. We shall
see that this transformation law makes clear the physical (or geometrical)
meaning of the tensor being invariant under a change of coordinate system.
The coordinate-invariant character of tensors answers the demand that the
proper formulation of physical laws should be independent of the choice of
coordinate systems.
It is obvious that physical processes must be independent of the coordinate
system. However, what is not so trivial is what the coordinate-independence
property of physical processes implies about the transformation law of mathematical
objects (i.e., tensors). The study of these implications and the classification
of physical quantities by means of the transformation laws constitute
the primary content of this chapter. Emphasis is placed on the fact that all
kinds of tensors are geometric objects whose representations (i.e., the values of
their components) obey a characteristic transformation law under coordinate
transformation.


18.1.2 Summation Convention 

In order to simplify subsequent notation, we introduce the following two con- 
ventions: 


Summation convention:

When the same index appears repeatedly in one term, we carry out a
summation with respect to the index. The range of summation is from 1
to $n$, where $n$ is the number of dimensions of the space.

Range convention:

All nonrepeated indices are understood to run from 1 to $n$.


These conventions are operative throughout this chapter unless specifically 
stated otherwise. 

Example The summation convention yields the new notation as

$$ a_i b_i = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. $$

Similarly, if $i$ and $j$ have the range from 1 to 2, then

$$ a_{ij} b_{ij} = a_{1j} b_{1j} + a_{2j} b_{2j} = a_{11} b_{11} + a_{12} b_{12} + a_{21} b_{21} + a_{22} b_{22}, $$

where it does not matter whether the first sum is carried out on $i$ or $j$.

Remark. Repeated indices are referred to as dummy indices since, owing to 
the implied summation, any such pair may be replaced by any other pair of 
repeated indices without changing the meaning of the mathematical expres- 
sion. 
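The two conventions can be mirrored in a few lines of code. The following is a minimal sketch in plain Python; the helper names `contract_ai_bi` and `contract_aij_bij` and the sample numbers are illustrative assumptions, not part of the text. Each repeated index becomes an explicit loop, and renaming a dummy index leaves the result unchanged.

```python
# Summation convention: a repeated index implies a sum over that index.
# The function names and sample data below are hypothetical illustrations.

def contract_ai_bi(a, b):
    """Return a_i b_i, i.e. the sum over the repeated index i."""
    return sum(a[i] * b[i] for i in range(len(a)))

def contract_aij_bij(a, b):
    """Return a_ij b_ij, summing over both repeated indices i and j."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n))

a = [1, 2, 3]
b = [4, 5, 6]
s1 = contract_ai_bi(a, b)            # 1*4 + 2*5 + 3*6

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
s2 = contract_aij_bij(A, B)          # 1*5 + 2*6 + 3*7 + 4*8

# Renaming the dummy index (i -> k) leaves the value unchanged.
s1_renamed = sum(a[k] * b[k] for k in range(len(a)))
```

The order of the two loops in `contract_aij_bij` is immaterial, which is the code counterpart of the statement that the sum may be carried out first on either index.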
18.1.3 Cartesian Coordinate System

A tensor is a mathematical object composed of several components. The values
of the components depend on the coordinate system employed, and so are
altered through a coordinate transformation even when the tensor itself re- 
mains unchanged. Among the many possible choices of coordinate transfor- 
mations, a rigid rotation of a rectangular Cartesian coordinate sys- 
tem is the simplest. The remainder of this section is devoted to explaining 
the basic properties of the simplest coordinate transformations, as a prelimi- 
nary for our subsequent study of tensors in terms of more general coordinate 
systems. 

We begin with two formal definitions: 

Cartesian coordinate system:

A Cartesian coordinate system associates a unique ordered set of real
numbers (coordinates) $(x_1, x_2, \cdots, x_n)$ with every point in a given
$n$-dimensional space by reference to a set of directed straight lines (coordinate
axes) $Ox_1, Ox_2, \cdots, Ox_n$ intersecting at the origin $O$.

Rectangular Cartesian coordinate system:

If the three axes of a Cartesian coordinate system are mutually perpendicular,
we have what is called a rectangular Cartesian coordinate
system.


Figure 18.1 is a schematic illustration of a rectangular Cartesian coordinate
system in three-dimensional space. Referring to this coordinate system, we
denote the triples (1,0,0), (0,1,0), (0,0,1) by $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$, respectively. These
triples are represented geometrically by mutually perpendicular unit arrows.

The set of Cartesian axes $Ox_1$, $Ox_2$, and $Ox_3$ is said to be right-handed
if and only if the rotation needed to turn the $x_1$-axis into the direction of
the $x_2$-axis through an angle $\angle x_1 O x_2 < \pi$ would propel a right-handed screw
toward the positive direction of the $x_3$-axis. Conversely, if such a rotation
propels a left-handed screw in the positive direction of the $x_3$-axis, the set of
axes is said to be left-handed. In this section, we consider only Cartesian
coordinate systems that are rectangular and right-handed.

Fig. 18.1. (a) Right-handed and (b) left-handed Cartesian coordinate systems

18.1.4 Rotation of Coordinate Axes 

We now formulate a rigid rotation of rectangular Cartesian axes. Assume a position
arrow $\mathbf{r}$ whose components are given by $(x_1, x_2, x_3)$ and $(x'_1, x'_2, x'_3)$
in terms of two different rectangular coordinate systems having a common
origin. We denote the set of unit orthogonal basis arrows associated with the
unprimed and primed system by $\{\mathbf{e}_k\}$ and $\{\mathbf{e}'_j\}$, respectively. The transformation
from one Cartesian coordinate system to another is called a rigid
rotation of Cartesian axes and has the following property:

A rigid rotation of Cartesian axes:

A rigid rotation of Cartesian axes is described by the transformation
equations of coordinates $x_k$ as

$$ x'_j = R_{jk} x_k \quad \text{(summed over } k\text{)}, \tag{18.1} $$

$$ x_k = R_{jk} x'_j \quad \text{(summed over } j\text{)}, \tag{18.2} $$

where $R_{jk} = \mathbf{e}'_j \cdot \mathbf{e}_k$ are the direction cosines of $\mathbf{e}'_j$ relative to $\mathbf{e}_k$.


Remarks.

1. Each of the two indices $j$ and $k$ for $R_{jk}$ refers to a different basis: the first
index $j$ refers to the primed set $\{\mathbf{e}'_j\}$, while the second index $k$ refers to
the unprimed one $\{\mathbf{e}_k\}$. This means that, in general, $R_{jk} \neq R_{kj}$.

2. The transformation coefficients $R_{jk}$ do not constitute a tensor, but simply
a set of real numbers. (See the second remark in Sect. 18.2.3.)


Proof A geometric arrow $\mathbf{r}$ joining the origin $O$ and the point $P$ is expressed
by

$$ \mathbf{r} = x_k \mathbf{e}_k = x'_j \mathbf{e}'_j. \tag{18.3} $$

We expand $\mathbf{e}_k$ by the set $\{\mathbf{e}'_j\}$ as

$$ \mathbf{e}_k = (\mathbf{e}'_j \cdot \mathbf{e}_k) \, \mathbf{e}'_j = R_{jk} \mathbf{e}'_j, \tag{18.4} $$

where we use both the range and the summation conventions. Substituting
(18.4) into (18.3), we obtain

$$ x_k R_{jk} \mathbf{e}'_j = x'_j \mathbf{e}'_j, $$

and thus

$$ (x_k R_{jk} - x'_j) \, \mathbf{e}'_j = 0. $$

Since the arrow set $\{\mathbf{e}'_j\}$ is linearly independent, the quantities in the parentheses
equal zero, which results in the desired equation (18.1).

Similarly, expanding $\{\mathbf{e}'_j\}$ by $\{\mathbf{e}_k\}$ as

$$ \mathbf{e}'_j = (\mathbf{e}'_j \cdot \mathbf{e}_k) \, \mathbf{e}_k = R_{jk} \mathbf{e}_k \tag{18.5} $$

and substituting it into (18.3), we arrive at equation (18.2). ∎

Remark. Observe that in the transformation law (18.1) and the expansion
(18.5), $R_{jk}$ acts on an unprimed entity (i.e., $x_k$ or $\mathbf{e}_k$) to produce a primed one
(i.e., $x'_j$ or $\mathbf{e}'_j$). However, in (18.2) and (18.4), $R_{jk}$ acts on a primed entity to
produce an unprimed one. In all the cases above, we should make sure that the
order of indices $j$ and $k$ attached to the coefficients $R_{jk}$ remains unchanged:
the first index, $j$, always refers to the primed entity and the second, $k$, to the
unprimed one.
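The index bookkeeping in (18.1) and (18.2) can be checked numerically. The sketch below is a minimal illustration in plain Python, assuming a rotation of the axes through an angle about the $x_3$-axis; the function names `rotation_coefficients`, `to_primed`, and `to_unprimed` are hypothetical. The same array `R[j][k]` is used in both directions, and only the summed index changes.

```python
import math

def rotation_coefficients(theta):
    """R[j][k] = e'_j . e_k for axes rotated by theta about the x_3-axis (assumed example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def to_primed(R, x):
    """x'_j = R_jk x_k (18.1): the sum runs over the second index of R."""
    n = len(x)
    return [sum(R[j][k] * x[k] for k in range(n)) for j in range(n)]

def to_unprimed(R, xp):
    """x_k = R_jk x'_j (18.2): the sum runs over the first index of R."""
    n = len(xp)
    return [sum(R[j][k] * xp[j] for j in range(n)) for k in range(n)]

R = rotation_coefficients(math.radians(30.0))
x = [1.0, 2.0, 3.0]
x_primed = to_primed(R, x)
x_back = to_unprimed(R, x_primed)   # should reproduce x up to rounding
```

Applying (18.1) and then (18.2) returns the original coordinates, which is the numerical counterpart of the statement that both equations describe the same arrow.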


18.1.5 Orthogonal Relations 

The following theorem states an important property of the transformation 
coefficients Rjk that gives rise to a rigid rotation of Cartesian axes. 

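Orthogonal relations:

The transformation coefficients $R_{jk}$ for a rigid rotation of axes satisfy
the conditions

$$ R_{ik} R_{jk} = \delta_{ij}, \qquad R_{ik} R_{i\ell} = \delta_{k\ell}. \tag{18.6} $$


Proof The first relation of (18.6) follows from a geometric formula for the
angle $\theta$ between two basis arrows $\mathbf{e}'_i$ and $\mathbf{e}'_j$. Taking the inner product of the
two basis arrows $\mathbf{e}'_i = R_{ik} \mathbf{e}_k$ and $\mathbf{e}'_j = R_{j\ell} \mathbf{e}_\ell$, we have

$$ \cos\theta = \mathbf{e}'_i \cdot \mathbf{e}'_j = R_{ik} R_{j\ell} (\mathbf{e}_k \cdot \mathbf{e}_\ell) = R_{ik} R_{j\ell} \delta_{k\ell} = R_{ik} R_{jk}. \tag{18.7} $$

If $i = j$, $\mathbf{e}'_i$ and $\mathbf{e}'_j$ coincide so that $\theta = 0$, whereas if $i \neq j$, $\mathbf{e}'_i$ and $\mathbf{e}'_j$ are
orthogonal so that $\theta = \pi/2$. Hence, we have

$$ R_{ik} R_{jk} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} $$

The second equation of (18.6) can be verified in a similar manner by considering
the angle between $\mathbf{e}_k$ and $\mathbf{e}_\ell$. ∎

The physical meaning of the relations (18.6) is rather obvious. They ensure
that the axes of each set $\{\mathbf{e}'_i\}$ or $\{\mathbf{e}_k\}$ are mutually orthogonal, i.e.,

$$ \mathbf{e}'_i \cdot \mathbf{e}'_j = \delta_{ij} \quad \text{and} \quad \mathbf{e}_k \cdot \mathbf{e}_\ell = \delta_{k\ell}. $$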
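Both relations in (18.6) can be verified numerically for a concrete rotation. The following sketch, in plain Python, assumes a rotation obtained by composing a rotation about the $x_3$-axis with one about the $x_1$-axis; the helper names (`rot_x3`, `rot_x1`, `max_orthogonality_error`) and the angles are illustrative choices only.

```python
import math

def rot_x3(theta):
    """Coefficients for axes rotated by theta about the x_3-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x1(theta):
    """Coefficients for axes rotated by theta about the x_1-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

def max_orthogonality_error(R):
    """Largest deviation from R_ik R_jk = delta_ij and R_ki R_kj = delta_ij."""
    n = len(R)
    err = 0.0
    for i in range(n):
        for j in range(n):
            rows = sum(R[i][k] * R[j][k] for k in range(n))   # sum over second index
            cols = sum(R[k][i] * R[k][j] for k in range(n))   # sum over first index
            err = max(err, abs(rows - delta(i, j)), abs(cols - delta(i, j)))
    return err

# Two successive rotations of the axes give another rigid rotation.
R = matmul(rot_x1(0.7), rot_x3(0.3))
error = max_orthogonality_error(R)
```

The value of `error` stays at the level of floating-point rounding, as expected for any rigid rotation.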
18.1.6 Matrix Representations


Since the transformation coefficients $R_{jk}$ have two subscripts, it is natural
to display their values in matrix form. The notation $[R_{jk}]$ is used to denote
the matrix having $R_{jk}$ as the element in the $j$th row and $k$th column. In
addition, when denoted by $\mathbf{R}$, it represents a linear operator of a rotation of
axes without reference to any of the values of its coefficients $R_{jk}$.


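Example In two dimensions, a rigid rotation of rectangular axes through an angle
$\theta$ is given by

$$ \mathbf{e}'_i = R_{ij} \mathbf{e}_j, \qquad [R_{ij}] = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \tag{18.8} $$

This rotation of axes gives rise to a coordinate transformation,

$$ x'_i = R_{ij} x_j, $$

or equivalently,

$$ \begin{pmatrix} x'_1 \\ x'_2 \end{pmatrix} = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}. \tag{18.9} $$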


Remark. We comment briefly on the distinction between active and passive
transformations, since it often causes confusion. Throughout this chapter, we
are concerned solely with passive transformations, for which physical entities
of interest (e.g., the mass of a particle or a geometric arrow) remain unaltered
and only the coordinate system is changed from $\{\mathbf{e}_i\}$ to $\{\mathbf{e}'_i\}$, as given by
(18.8). In contrast, an active transformation alters the position and/or the
direction of the physical entity itself, while the axes $\{\mathbf{e}_i\}$ remain fixed. In
the latter case, a rotation of a geometric arrow $\mathbf{x}$ through an angle $\theta$ in two
dimensions is described by

$$ \mathbf{e}'_i = \mathbf{e}_i \quad \text{and} \quad \begin{pmatrix} x'_1 \\ x'_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, $$

which obviously differs from the expressions for a passive transformation. Figure 18.2
illustrates the difference between the two transformations.
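A small numerical comparison may help to keep the two pictures apart. The sketch below, in plain Python with hypothetical variable names, applies the passive coefficients of (18.8) and the active rotation matrix to the same arrow along the $x_1$-axis, for an assumed angle of 90 degrees.

```python
import math

theta = math.radians(90.0)
c, s = math.cos(theta), math.sin(theta)

passive = [[c, s], [-s, c]]    # coefficients R_ij of (18.8): axes rotate, arrow fixed
active = [[c, -s], [s, c]]     # arrow rotates by theta, axes fixed

def apply(M, x):
    """Multiply a 2x2 matrix by a 2-component column."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

x = [1.0, 0.0]                    # arrow along the x_1-axis
x_passive = apply(passive, x)     # same arrow, components in the rotated axes
x_active = apply(active, x)       # rotated arrow, components in the fixed axes
```

In the passive case the arrow appears along the negative $x'_2$-axis, whereas in the active case the arrow itself ends up along the positive $x_2$-axis; the two matrices are transposes of each other.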


It can be shown that the determinant of the matrix $[R_{k\ell}]$ reads

$$ \det[R_{k\ell}] = \pm 1 \tag{18.10} $$

(see the proof in Exercise 5). This means that there are two classes of rectangular
Cartesian coordinate systems, corresponding to the positive and negative
signs in (18.10). Throughout this section, we consider only cases of
$\det[R_{k\ell}] = +1$, which corresponds to our previous restriction to a single type
of coordinate system (i.e., right-handed). We shall see in Sect. 18.3.1 that
the rotation of coordinate axes whose transformation coefficients give rise to
$\det[R_{k\ell}] = -1$ yields the left-handed system, which for the moment is beyond
our scope.

Fig. 18.2. Difference between a passive (a) and an active (b) transformation

18.1.7 Determinant of a Matrix 

We close this section by commenting on a formal definition of the determinant
of a matrix and related material, as a preliminary for the exercises below.

1. The determinant

$$ D = \det[a_{ij}] = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \tag{18.11} $$

of the square array of $n^2$ numbers (elements) $a_{ij}$ is the sum of the $n!$ terms

$$ (-1)^r a_{1k_1} a_{2k_2} \cdots a_{nk_n}, \tag{18.12} $$

each corresponding to one of the $n!$ different ordered sets $k_1, k_2, \cdots, k_n$
obtained by $r$ interchanges of elements from the ordered set $1, 2, \cdots, n$.

2. The minor (or complementary minor) $M_{ij}$ of the element $a_{ij}$ in the
$n$th-order determinant $D = \det[a_{ij}]$ is the $(n-1)$th-order determinant
obtained from (18.11) on erasing the $i$th row and the $j$th column.

Example Given a third-order determinant

$$ D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, $$

its $a_{12}$ minor is obtained by removing the first row and the second column
in $D$, as expressed by

$$ M_{12} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}. $$
3. The cofactor $C_{ij}$ of the element $a_{ij}$ is defined by

$$ C_{ij} = \frac{\partial D}{\partial a_{ij}}, $$

or equivalently,

$$ C_{ij} = (-1)^{i+j} M_{ij}. $$

4. A determinant $D$ may be represented in terms of the elements and cofactors
of any one row or column by

$$ D = \det[a_{ij}] = \sum_{i=1}^{n} a_{ij} C_{ij} = \sum_{k=1}^{n} a_{jk} C_{jk} \quad \text{(with } j \text{ fixed)}. \tag{18.13} $$

This is called the simple Laplace development of a determinant $D$. (The
proof is given in Exercise 2.) The expression (18.13) gives the same value
for $D$ regardless of the column or row [i.e., no matter what value of $j$ in
(18.13)] we choose in the expansion. Note also that for $j \neq h$,

$$ \sum_{i=1}^{n} a_{ij} C_{ih} = \sum_{k=1}^{n} a_{jk} C_{hk} = 0. $$


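Example The expansion by the first row gives

$$ D = \begin{vmatrix} 1 & 3 & 0 \\ 2 & 6 & 4 \\ -1 & 0 & 2 \end{vmatrix} = 1 \begin{vmatrix} 6 & 4 \\ 0 & 2 \end{vmatrix} - 3 \begin{vmatrix} 2 & 4 \\ -1 & 2 \end{vmatrix} + 0 \begin{vmatrix} 2 & 6 \\ -1 & 0 \end{vmatrix} = -12. $$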


Remark. In view of the expansion (18.13), an $n$th-order determinant $D$ is represented
by a linear combination of $n$ determinants of order $n-1$.
Similarly, each of the latter $(n-1)$th-order determinants is in turn represented
in terms of $n-1$ determinants of order $n-2$, and so on. In a
successive manner, we finally arrive at $n!$ first-order determinants
(i.e., just $n!$ real numbers), each of which is expressed by (18.12).
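The recursive structure described in this remark translates directly into a short program. The following is a minimal sketch in plain Python of the Laplace development (18.13) along a chosen row; the function names `minor` and `det` are illustrative, and the indices are 0-based in the code. It is applied to the third-order determinant of the example above, expanding along each of the three rows in turn.

```python
def minor(a, i, j):
    """Array left after erasing row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a, row=0):
    """Determinant by Laplace development along the given row, cf. (18.13)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # Each term is a_{row,k} times its cofactor (-1)^(row+k) M_{row,k}.
    return sum(a[row][k] * (-1) ** (row + k) * det(minor(a, row, k))
               for k in range(n))

D = [[1, 3, 0],
     [2, 6, 4],
     [-1, 0, 2]]

values = [det(D, row=j) for j in range(3)]   # expand along each row in turn
```

All three expansions return the same number, in line with the statement that the value of (18.13) does not depend on the row chosen. The recursion visits $n!$ products, so this sketch is meant for illustration rather than for large matrices.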


Exercises 

1. Check the validity of the orthogonal conditions (18.6) for the transformation
coefficients $R_{ij}$ in two dimensions.

Solution: For instance, if we set $j = 1$ and $k = 2$ in $R_{ij} R_{ik} = \delta_{jk}$, then

$$ R_{i1} R_{i2} = R_{11} R_{12} + R_{21} R_{22} = \cos\theta \sin\theta + (-\sin\theta) \cos\theta = 0, $$

or if we set $j = k = 2$ in $R_{ji} R_{ki} = \delta_{jk}$, we have

$$ R_{2i} R_{2i} = R_{21} R_{21} + R_{22} R_{22} = (-\sin\theta)^2 + \cos^2\theta = 1. $$

Other equations can be proved in a similar way. ∎
2. Show that the expression (18.13) for a determinant

$$ D = \det[a_{pq}] = \sum_{k=1}^{n} a_{ik} C_{ik} \quad \text{and} \quad D = \sum_{\ell=1}^{n} a_{\ell i} C_{\ell i} \quad \text{for fixed } i $$

yields the same value of $D$ no matter what value of $i$ we choose.

Solution: We prove only the first formula, since the proof of the
second is quite similar to that of the first.

It easily follows that the statement is true for a second-order
determinant, for which the expansion with fixed $i = 1$, $a_{11} a_{22} + a_{12}(-a_{21})$,
and that with $i = 2$, $a_{21}(-a_{12}) + a_{22} a_{11}$, give the same
value. By mathematical induction, we tentatively assume that the
statement is true for an $(n-1)$th-order determinant and try to
prove that it is also true for an $n$th-order determinant.

To do so, we expand $D$ in terms of each of two arbitrary
rows, say, the $i$th and the $j$th row with $i < j$, and compare the
results.

(i) Let us first expand $D$ by its $i$th row. A typical term in the
expansion by the $i$th row reads

$$ a_{ik} C_{ik} = a_{ik} \cdot (-1)^{i+k} M_{ik}, \tag{18.14} $$

where $i$ is fixed and $k$ runs from 1 to $n$. Since the minor $M_{ik}$ of $a_{ik}$
in $D$ is an $(n-1)$th-order determinant, owing to the induction
hypothesis it can be expanded by any row. Expand $M_{ik}$ by its
$(j-1)$th row. This row corresponds to the $j$th row of $D$, because
$M_{ik}$ does not contain elements of the $i$th row of $D$ and $i < j$.
Hence, the expansion of $M_{ik}$ by its $(j-1)$th row consists of the
linear combination of the elements $a_{j\ell}$ with $\ell = 1, 2, \cdots, k-1, k+1, \cdots, n$
(i.e., $\ell \neq k$). We distinguish between the two cases, $\ell < k$
and $\ell > k$, as follows.

For $\ell < k$, the element $a_{j\ell}$ belongs to the $\ell$th column of $M_{ik}$.
Hence, the term involving $a_{j\ell}$ in the expansion of $M_{ik}$ reads

$$ a_{j\ell} \cdot (\text{cofactor of } a_{j\ell} \text{ in } M_{ik}) = a_{j\ell} \cdot (-1)^{(j-1)+\ell} M_{ikj\ell}. \tag{18.15} $$

Here $M_{ikj\ell}$ is the minor of $a_{j\ell}$ in $M_{ik}$, which is obtained from $D$
by deleting the $i$th and $j$th rows and the $k$th and $\ell$th columns
of $D$. Then it follows from (18.14) and (18.15) that the resulting
terms in the expansion of $D$ are of the form

$$ a_{ik} a_{j\ell} \cdot (-1)^b M_{ikj\ell} \quad \text{with } b = i + k + j + \ell - 1. \tag{18.16} $$

If $\ell > k$, the only difference is that $a_{j\ell}$ belongs to the $(\ell-1)$th
column of $M_{ik}$, because $M_{ik}$ does not contain elements of the $k$th
column of $D$ and $k < \ell$. This results in an additional minus sign
in (18.15), and instead of (18.16) we obtain $a_{ik} a_{j\ell} \cdot (-1)^{b+1} M_{ikj\ell}$
with the same value of $b$. In short, the expansion of $D$ by the $i$th
row yields

$$ D = \sum_{k=1}^{n} a_{ik} C_{ik} = \sum_{k=1}^{n} a_{ik} \left[ \sum_{\ell=1}^{k-1} (-1)^b a_{j\ell} M_{ikj\ell} + \sum_{\ell=k+1}^{n} (-1)^{b+1} a_{j\ell} M_{ikj\ell} \right] \tag{18.17} $$

with $b = i + k + j + \ell - 1$.

(ii) We next expand $D$ by the $j$th row. A typical term in this
expansion is

$$ a_{j\ell} C_{j\ell} = a_{j\ell} \cdot (-1)^{j+\ell} M_{j\ell}. \tag{18.18} $$

By the induction hypothesis, we may expand the minor $M_{j\ell}$ of $a_{j\ell}$
in $D$ by its $i$th row, which corresponds to the $i$th row of $D$ since
$j > i$.

For $k > \ell$, the element $a_{ik}$ in that row belongs to the $(k-1)$th
column of $M_{j\ell}$, because $M_{j\ell}$ does not contain elements of the $\ell$th
column of $D$ and $\ell < k$. Hence, the term involving $a_{ik}$ in this
expansion is

$$ a_{ik} \cdot (\text{cofactor of } a_{ik} \text{ in } M_{j\ell}) = a_{ik} \cdot (-1)^{i+(k-1)} M_{j\ell ik}, \tag{18.19} $$

where the minor $M_{j\ell ik}$ of $a_{ik}$ in $M_{j\ell}$ is obtained by deleting the
$i$th and $j$th rows and the $k$th and $\ell$th columns of $D$, and is thus
identical to $M_{ikj\ell}$ in (18.15). It follows from (18.18) and (18.19)
that this yields a representation whose terms are identical to those
given by (18.16) when $\ell < k$.

For $k < \ell$, the element $a_{ik}$ belongs to the $k$th column of $M_{j\ell}$,
so we get an additional minus sign, and the result agrees with the
terms for $\ell > k$ in (18.17). Hence, we conclude that the expansion of
$D$ by the $j$th row (with $j > i$), $\sum_{k=1}^{n} a_{jk} C_{jk}$, is identical to the expansion
(18.17).

The conclusions from the discussions in (i) and (ii) clearly show
that the two expansions of $D$ consist of the same terms, which
completes our proof of the given statement. ∎


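3. Let $b_{kj} = C_{jk}/D$, where $C_{jk}$ is the cofactor of $a_{jk}$ in $D = \det[a_{jk}]$. Show
that

$$ b_{kj} a_{\ell k} = \delta_{j\ell} \quad \text{and} \quad b_{kj} a_{j\ell} = \delta_{k\ell}, \tag{18.20} $$

which means that the matrix $[b_{kj}]$ is the inverse of $[a_{jk}]$.

Solution: It follows that

$$ b_{kj} a_{\ell k} = \frac{C_{jk} a_{\ell k}}{D} = \frac{a_{\ell 1} C_{j1} + a_{\ell 2} C_{j2} + \cdots + a_{\ell n} C_{jn}}{D}. \tag{18.21} $$

The discussion in Exercise 2 tells us that the sum in the numerator
equals $D$ when $\ell = j$ regardless of the value of $j$. Hence, we have

$$ b_{kj} a_{\ell k} = 1 \quad \text{if } j = \ell. $$

We next consider the case of $j \neq \ell$. To do this, we replace the
elements in the $j$th row of $D$ by those in the $\ell$th row of $D$.
The resulting determinant, denoted by $\tilde{D}$, reads

$$ \tilde{D} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{j-1,1} & a_{j-1,2} & \cdots & a_{j-1,n} \\ a_{\ell,1} & a_{\ell,2} & \cdots & a_{\ell,n} \\ a_{j+1,1} & a_{j+1,2} & \cdots & a_{j+1,n} \\ \vdots & \vdots & & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{vmatrix} = 0, $$

since $\tilde{D}$ has two identical rows. Note that the expansion of $\tilde{D}$ by its
$j$th row (which now contains the elements $a_{\ell p}$) is $\tilde{D} = \sum_{p=1}^{n} a_{\ell p} C_{jp}$,
which equals the sum in the numerator in (18.21). It thus follows that

$$ b_{kj} a_{\ell k} = 0 \quad \text{if } j \neq \ell. $$

These arguments complete the proof of the first equation. The
second can be verified in the same manner. ∎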

4. Suppose a matrix $[Q_{kj}]$ is defined by $Q_{kj} = C_{jk}/\det[R_{jk}]$, where the $R_{jk}$
are the transformation coefficients of a rigid rotation of axes and the $C_{jk}$
are the cofactors of $R_{jk}$ in $\det[R_{jk}]$. Prove that $Q_{kj} = R_{jk}$.

Solution: Apply the result of Exercise 3 to find that $Q_{kj} R_{\ell k} = \delta_{j\ell}$.
Multiplying both sides by $R_{\ell m}$ and summing with respect to $\ell$, we
arrive at

$$ Q_{kj} R_{\ell k} R_{\ell m} = \delta_{j\ell} R_{\ell m} = R_{jm}. $$

Then, the orthogonal relation $R_{\ell k} R_{\ell m} = \delta_{km}$ implies that
$Q_{kj} \delta_{km} = Q_{mj} = R_{jm}$. ∎

5. Show that $\det[R_{jk}] = \pm 1$.

Solution: It follows from (18.20) that

$$ \det[Q_{kj} R_{j\ell}] = \det[\delta_{k\ell}] = 1, $$

where the $Q_{kj}$ are the same quantities as in Exercise 4. From
elementary linear algebra, we find that

$$ \det[Q_{kj} R_{j\ell}] = \det[Q_{kj}] \det[R_{j\ell}] = (\det[R_{jk}])^2, $$

where the identity $Q_{kj} = R_{jk}$ was used to obtain the last term.
Combining the two results above, we obtain $\det[R_{jk}] = \pm 1$. ∎
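The results of Exercises 3 to 5 can be cross-checked numerically. The sketch below, in plain Python, builds the matrix $b_{kj} = C_{jk}/D$ from cofactors and tests it on two inputs: the third-order array used in the determinant example of Sect. 18.1.7, and an assumed rotation of the axes about the $x_3$-axis. The function names and the angle 0.4 are illustrative choices only.

```python
import math

def minor(a, i, j):
    """Array left after erasing row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Determinant by Laplace development along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))

def cofactor_inverse(a):
    """b[k][j] = C_jk / D, with C_jk the cofactor of a[j][k]."""
    n = len(a)
    D = det(a)
    return [[(-1) ** (j + k) * det(minor(a, j, k)) / D for j in range(n)]
            for k in range(n)]

# Exercise 3: b_kj a_jl should be the Kronecker delta.
A = [[1, 3, 0], [2, 6, 4], [-1, 0, 2]]
B = cofactor_inverse(A)
BA = [[sum(B[k][j] * A[j][l] for j in range(3)) for l in range(3)]
      for k in range(3)]

# Exercises 4 and 5: for a rigid rotation, Q_kj = R_jk and det R = +1.
c, s = math.cos(0.4), math.sin(0.4)
R = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
Q = cofactor_inverse(R)
det_R = det(R)
max_diff = max(abs(Q[k][j] - R[j][k]) for k in range(3) for j in range(3))
```

For the rotation, `Q` coincides with the transpose of `R` up to rounding, and `det_R` equals one, consistent with the restriction to proper rotations in this section.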


18.2 Cartesian Tensors 

18.2.1 Cartesian Vectors 

Having dealt with the rotation of coordinate axes, we are ready to introduce
the concept of tensors and their transformation law in terms of Cartesian
coordinate systems. Assume an ordered set of three quantities $v_i$ ($i = 1, 2, 3$)
that are explicit or implicit functions of $x_j$. Let us see how the values of
$v_i(x_j)$ change through a rigid rotation of the Cartesian axes. If they transform
according to the law given below, the quantities $v_i$ are called the components
of a particular kind of tensor, i.e., of a Cartesian vector (or a first-order
Cartesian tensor).


Cartesian vectors:

A Cartesian vector $\mathbf{v}$ is an object represented by an ordered set of $n$
functions $v_i(x_j)$ in terms of the $x$-coordinate system and by another set of
$n$ functions $v'_i(x'_j)$ in terms of the $x'$-coordinate system, where $v'_i$ and $v_i$
at each point are related by the transformation law:

$$ v'_i = R_{ij} v_j \quad \text{and} \quad v_i = R_{ki} v'_k, \tag{18.22} $$

where $R_{ij} = \mathbf{e}'_i \cdot \mathbf{e}_j$.


Obviously, a vector $\mathbf{v}$ is a geometric object (like an arrow) so that it is uniquely
determined independently of the coordinate system. On the other hand, the
functional form of the components $v_i(x_j)$ depends on our choice of coordinate
system, even when we consider only the same vector $\mathbf{v}$. This is why the concepts
of a vector and its components are inherently different from each other.
(See also Sect. 18.2.2 on this point.)

We emphasize again that the index $i$ of $R_{ij}$ in (18.22) refers to the dashed
(transformed) function $v'_i$, whereas the $j$ refers to the undashed (original)
one $v_j$. In the following, we consider several examples of ordered sets of functions
$v_i$ in two dimensions, which may or may not be a first-order Cartesian
tensor.

Examples 1. The ordered set of functions $(v_i)$ ($i = 1, 2$) with the components
$v_1 = x_2$ and $v_2 = -x_1$.

Using the transformation law of coordinates $x'_i = R_{ij} x_j$, we get the following
for each function:

$$ v'_1 = x'_2 = R_{21} x_1 + R_{22} x_2 = -x_1 \sin\theta + x_2 \cos\theta, $$
$$ v'_2 = -x'_1 = -R_{11} x_1 - R_{12} x_2 = -x_1 \cos\theta - x_2 \sin\theta. $$

On the other hand, the functions $v'_i$ should be obtained from $v_i$ through
the transformation law as

$$ v'_1 = R_{1k} v_k = v_1 \cos\theta + v_2 \sin\theta = x_2 \cos\theta - x_1 \sin\theta, $$
$$ v'_2 = R_{2k} v_k = -v_1 \sin\theta + v_2 \cos\theta = -x_2 \sin\theta - x_1 \cos\theta. $$

The two expressions for $v'_1$ and $v'_2$ are identical to one another regardless
of the value of $\theta$. Therefore, the pair of functions $v_i(x_j)$ are components
of a Cartesian vector.

2. The set $v_i$ with $v_1 = x_2$ and $v_2 = x_1$.

Following the same procedure as above, we have

$$ v'_1 = x'_2 = -s x_1 + c x_2, $$
$$ v'_2 = x'_1 = c x_1 + s x_2 $$

and

$$ v'_1 = c v_1 + s v_2 = c x_2 + s x_1, $$
$$ v'_2 = -s v_1 + c v_2 = -s x_2 + c x_1, $$

where $c$ and $s$ represent $\cos\theta$ and $\sin\theta$, respectively. These two sets of
expressions do not agree with each other. Hence, the pair $(x_2, x_1)$ is not
a first-order Cartesian tensor.
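The comparison carried out by hand in the two examples can also be done numerically. The following plain-Python sketch evaluates a pair of functions once in the primed coordinates (same functional form, primed arguments) and once through the transformation law (18.22), and reports whether the two agree. The helper `is_vector_field`, the test point, and the angle are hypothetical choices; agreement at a single point and angle is only a necessary check, not a proof.

```python
import math

def R2(theta):
    """Two-dimensional rotation coefficients of (18.8)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def apply(R, v):
    return [R[0][0] * v[0] + R[0][1] * v[1],
            R[1][0] * v[0] + R[1][1] * v[1]]

def is_vector_field(f, theta=0.6, x=(1.3, -0.7), tol=1e-12):
    """Compare f evaluated at the primed coordinates with R applied to f(x)."""
    R = R2(theta)
    x_primed = apply(R, list(x))
    direct = f(x_primed)             # same functional form, primed coordinates
    by_law = apply(R, f(list(x)))    # v'_i = R_ik v_k
    return all(abs(d - t) < tol for d, t in zip(direct, by_law))

def example_1(x):
    return [x[1], -x[0]]     # v_1 = x_2, v_2 = -x_1

def example_2(x):
    return [x[1], x[0]]      # v_1 = x_2, v_2 = x_1

result_1 = is_vector_field(example_1)
result_2 = is_vector_field(example_2)
```

With these inputs, `result_1` is true and `result_2` is false, matching the conclusions of Examples 1 and 2.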

18.2.2 A Vector and a Geometric Arrow 

The result of Example 2 in Sect. 18.2.1 might be confusing for some readers;
the functions $v_i(x_1, x_2)$ given there are not components of a vector, although
they appear to represent a geometric arrow in the $(x_1$-$x_2)$-plane. To make this
point clear, we have to comment on the difference between the formal definition
of a vector as a first-order tensor and our familiar definition of a vector
as a geometric arrow.

In elementary calculus, vectors are simply defined by a geometric arrow 
with certain length and direction, commonly denoted by a bold-face letter, 
say, v. Owing to this definition, v is uniquely determined by specifying its 
length and direction, which are both independent of our choice of coordinate 
systems. However, the uniqueness disappears if it is defined algebraically by 
specifying its components Vk relative to given coordinate axes. The values 
of the components Vk depend on our choice of coordinate system even when 
the same arrow v is considered. Hence, when we apply a coordinate transfor- 
mation, the values of Vk are altered in a way that preserves the length and 
direction of the arrow v, according to (18.22), which is why we call the set of 
n functions Vk not a vector, but the components of a vector. 

In short, we should always keep in mind that a vector is a geometric 
object independent of coordinate systems, whereas components of a vector 
are just mathematical representations of the vector with reference to a specific 
coordinate system. This caution applies to all the classes of tensors presented
throughout this section.

Remark. Despite the above caution, we sometimes call a set of components of
a tensor just a “tensor” to shorten our sentences. However, it is important to
note an inherent difference between a tensor (= coordinate-independent object)
and components of a tensor (= coordinate-dependent quantities).


18.2.3 Cartesian Tensors 

We turn to a second-order Cartesian tensor that requires two subscripts 
to identify a particular element of the set. 

Second-order Cartesian tensor:

A second-order Cartesian tensor $\mathbf{T}$ is an object represented by an ordered
set of two-index quantities $T_{ij}$ in terms of the $x$-coordinate system and by
another set of quantities $T'_{k\ell}$ in terms of the $x'$-coordinate system, where
$T_{ij}$ and $T'_{k\ell}$ at each point are related by

$$ T'_{ij} = R_{ik} R_{j\ell} T_{k\ell}, \qquad T_{k\ell} = R_{mk} R_{n\ell} T'_{mn}. \tag{18.23} $$

Here, the two-index quantities $T_{ij}$ are called components of the tensor
$\mathbf{T}$.


In a similar way, we may define a Cartesian tensor of general order as follows:
The set of expressions $T_{ij \cdots k}$ are the components of a Cartesian tensor if, for
all rotations of the axes, the expressions using the new coordinates, $T'_{ij \cdots k}$,
are given by

$$ T'_{ij \cdots k} = R_{i\ell} R_{jm} \cdots R_{kn} T_{\ell m \cdots n} $$

and

$$ T_{\ell m \cdots n} = R_{p\ell} R_{qm} \cdots R_{rn} T'_{pq \cdots r}. $$

It is apparent that an $n$th-order Cartesian tensor in three dimensions has $3^n$
components.


Example Assume two Cartesian vectors $\mathbf{a}$ and $\mathbf{b}$, each of which is represented
by the components $a_j$ and $b_k$ associated with the same coordinate system.
Then, it is possible to create nine products of the components expressed by

$$ a_j b_k \quad (j, k = 1, 2, 3), $$

which is called an outer product (or direct product) of the vectors $\mathbf{a}$
and $\mathbf{b}$ (see also Sect. 18.4.3). The outer product constitutes a second-order
Cartesian tensor. In fact, since each $a_i$ and $b_j$ transforms as

$$ a'_i = R_{ik} a_k \quad \text{and} \quad b'_j = R_{j\ell} b_\ell, $$

we have

$$ T'_{ij} = a'_i b'_j = R_{ik} R_{j\ell} a_k b_\ell = R_{ik} R_{j\ell} T_{k\ell}. $$

Remark. We emphasize that transformation coefficients, say, the $R_{ij}$, do not
form a tensor and note the fact that the two indices $i$ and $j$ in the tensor
$T_{ij}$ refer to the same coordinate system, whereas those in the coefficients
$R_{ij}$ refer to different coordinate systems. Hence, $T_{ij}$ and $R_{ij}$ are inherently
different from each other, though both require two indices.


18.2.4 Scalars 

Contrary to the case of finite-order tensors, we now consider quantities that 
are unchanged by a rotation of axes, which are called scalars or tensors of 
zero order and contain only one component. The most obvious example is 
the square of the distance of a point from the origin, which must be invariant 
under any rotation of coordinate axes. Other examples of scalars are presented 
below. 

Examples 1. We show that the scalar product $\mathbf{u} \cdot \mathbf{v}$ is invariant under
rotation. In the original (unprimed) system, the scalar product is given
in terms of components by $u_i v_i$ and in the rotated (primed) system, it is
given by

$$ u'_i v'_i = (R_{ij} u_j)(R_{ik} v_k) = R_{ij} R_{ik} u_j v_k = \delta_{jk} u_j v_k = u_j v_j, \tag{18.24} $$

where the orthogonal relation $R_{ij} R_{ik} = \delta_{jk}$ in (18.6) was used. Since
the expression in the rotated system is the same as that in the original
system, the scalar product is indeed invariant under rotations.


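2. If the $v_i$ are the components of a vector, the divergence $\nabla \cdot \mathbf{v} = \partial v_i/\partial x_i$
becomes a scalar. This is proven as follows: In the rotated coordinate
system, $\nabla \cdot \mathbf{v}$ is given by

$$ \frac{\partial v'_i}{\partial x'_i} = \frac{\partial}{\partial x'_i} (R_{ik} v_k) = R_{ik} \frac{\partial v_k}{\partial x'_i}, $$

where the elements $R_{ik} = \mathbf{e}'_i \cdot \mathbf{e}_k$ are not functions of position. Using the
relation $\partial x_j/\partial x'_i = R_{ij}$ (see Exercise 2 below), we have

$$ R_{ik} \frac{\partial v_k}{\partial x'_i} = R_{ik} \frac{\partial x_j}{\partial x'_i} \frac{\partial v_k}{\partial x_j} = R_{ik} R_{ij} \frac{\partial v_k}{\partial x_j} = \delta_{jk} \frac{\partial v_k}{\partial x_j} = \frac{\partial v_j}{\partial x_j}. $$

Finally, we obtain

$$ \frac{\partial v'_i}{\partial x'_i} = \frac{\partial v_j}{\partial x_j}. $$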
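Both examples lend themselves to a quick numerical check. The plain-Python sketch below compares the scalar product before and after a rotation of the axes, and does the same for the divergence of a linear field $v_i = A_{ij} x_j$, for which $\partial v_i/\partial x_j = A_{ij}$ holds exactly. The linear field, the matrix `A`, the angles, and the helper names are assumptions introduced only for this illustration.

```python
import math

def rot_x3(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x1(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def apply(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(u[i] * v[i] for i in range(3))

R = matmul(rot_x1(1.1), rot_x3(0.4))

# Example 1: u'_i v'_i = u_j v_j.
u = [1.0, -2.0, 0.5]
v = [3.0, 0.25, -1.0]
dot_original = dot(u, v)
dot_rotated = dot(apply(R, u), apply(R, v))

# Example 2 for a linear field v_i = A_ij x_j: dv_i/dx_j = A_ij, div v = A_ii.
A = [[2.0, 1.0, 0.0], [-1.0, 0.5, 3.0], [4.0, 0.0, -1.5]]
A_primed = matmul(matmul(R, A), transpose(R))   # dv'_i/dx'_j = R_ik A_kl R_jl
div_original = sum(A[i][i] for i in range(3))
div_rotated = sum(A_primed[i][i] for i in range(3))
```

In both cases the primed and unprimed values coincide up to rounding, as (18.24) and the divergence calculation require.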





Exercises 


1. Examine whether or not the ordered set of functions $v_i$ defined by $v_1 =
(x_1)^2$ and $v_2 = (x_2)^2$ constitutes a vector. Here $(x_1)^2$ means the square of
$x_1$.

Solution: Examining the first function, $v_1$, alone is sufficient to
show that this pair is not a vector. With $c = \cos\theta$ and $s = \sin\theta$,
evaluating $v'_1$ directly from (18.9) gives

$$ v'_1 = (x'_1)^2 = c^2 (x_1)^2 + 2cs \, x_1 x_2 + s^2 (x_2)^2, $$

whereas the transformation law (18.22) with the coefficients (18.8) requires
that $v'_1 = c v_1 + s v_2 = c (x_1)^2 + s (x_2)^2$,
which is different from the above. ∎


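2. Show that $R_{ij} = \mathbf{e}'_i \cdot \mathbf{e}_j = \dfrac{\partial x'_i}{\partial x_j} = \dfrac{\partial x_j}{\partial x'_i}$ in Cartesian coordinate systems.

Solution: Set $x'_i \mathbf{e}'_i = x_j \mathbf{e}_j$ and differentiate both sides with respect
to $x_j$ to obtain $(\partial x'_i/\partial x_j) \mathbf{e}'_i = \mathbf{e}_j$. Taking the scalar product
of both sides with $\mathbf{e}'_k$ yields

$$ \frac{\partial x'_i}{\partial x_j} \mathbf{e}'_i \cdot \mathbf{e}'_k = \frac{\partial x'_i}{\partial x_j} \delta_{ik} = \frac{\partial x'_k}{\partial x_j} \quad \text{and} \quad \mathbf{e}_j \cdot \mathbf{e}'_k = R_{kj}. $$

Hence, we have $R_{kj} = \partial x'_k/\partial x_j$. Similarly, if we differentiate by
$x'_i$ (instead of $x_j$), the first identity yields $R_{kj} = \partial x_j/\partial x'_k$. ∎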


3. Show that the gradient of a vector $\mathbf{v}$, denoted by $\nabla \mathbf{v}$, is a second-order
tensor.

Solution: Suppose that $v_i$ represents the components of a vector
$\mathbf{v}$ and consider the quantities generated by $T_{ij} = \partial v_i/\partial x_j$ ($i, j =
1, 2, 3$). These nine quantities form the components of a second-order
tensor, as can be seen from the fact that

$$ T'_{ij} = \frac{\partial v'_i}{\partial x'_j} = \frac{\partial (R_{ik} v_k)}{\partial x_\ell} \frac{\partial x_\ell}{\partial x'_j} = R_{ik} \frac{\partial v_k}{\partial x_\ell} R_{j\ell} = R_{ik} R_{j\ell} T_{k\ell}. \ ∎ $$


Remark. The concept (and its notation) $\nabla \mathbf{v}$ introduced above is not in the
category of simple vector calculus. In fact, the quantity $\nabla \mathbf{v}$ is not a vector
like $\nabla \times \mathbf{v}$ and $\nabla \phi$, but a second-order tensor.


18.3 Pseudotensors 

18.3.1 Improper Rotations 

So far our coordinate transformations have been restricted to rigid rotations
described by an orthogonal matrix $[R_{ij}]$ with the property

$$ |\mathbf{R}| = \det[R_{ij}] = +1. $$

Such transformations are called proper rotations. We now broaden our discussion
to include transformations that are described by an orthogonal matrix
$[R_{ij}]$ for which

$$ |\mathbf{R}| = -1. $$

The latter kind of transformations are called improper rotations (or rotations
with reflection). Below are two examples of improper rotations.

(a) Inversion 

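The most obvious example of an improper rotation is an inversion of the
coordinate axes through the origin represented by

$$ \mathbf{e}'_i = -\mathbf{e}_i \quad \text{for all } i = 1, 2, 3. $$

In this case, a position arrow $\mathbf{x}$ is described in terms of the bases $\mathbf{e}_i$ and $\mathbf{e}'_i$,
respectively, by

$$ \mathbf{x} = x_i \mathbf{e}_i \quad \text{and} \quad \mathbf{x} = x'_i \mathbf{e}'_i = x'_i (-\mathbf{e}_i). $$

Equating them, we obtain

$$ x'_i = -x_i = -\delta_{ij} x_j, $$

which shows that an inversion of axes is expressed by $R_{ij} = -\delta_{ij}$. In fact, its
determinant becomes

$$ |\mathbf{R}| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{vmatrix} = -1. $$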


(b) Reflection 

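Another example is a reflection that reverses the direction of one basis arrow:

$$ \mathbf{e}'_i = -\mathbf{e}_i \quad \text{for a specified } i. $$

For the reflection of the $x_1$-axis, e.g., we have

$$ |\mathbf{R}| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1. $$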


Remark. Note that an inversion is different from a proper rotation only for 
an odd dimensionality. In a case of two or four dimensions, for instance, an 
inversion is the same as a proper rotation. In contrast, a reflection that changes 
the sign of only one coordinate is always different from a proper rotation 
regardless of the dimensionality. 
Through an improper rotation, our initial right-handed coordinate
system is changed into a left-handed one. This is illustrated schematically in 
Fig. 18.3. The reader should note that such a change cannot be accomplished 
by any kind of proper rotation. 
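The dependence of the inversion on the dimensionality, noted in the remark above, is easy to confirm by computing determinants. The following plain-Python sketch builds the inversion and the single-axis reflection in two, three, and four dimensions; the function names are illustrative and the determinant is evaluated by a simple first-row expansion.

```python
def minor(a, i, j):
    """Array left after erasing row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))

def inversion(n):
    """R_ij = -delta_ij: every axis is reversed."""
    return [[-1 if i == j else 0 for j in range(n)] for i in range(n)]

def reflection(n, axis=0):
    """Only the axis with the given (0-based) index is reversed."""
    return [[(-1 if i == axis else 1) if i == j else 0 for j in range(n)]
            for i in range(n)]

dets_inversion = {n: det(inversion(n)) for n in (2, 3, 4)}
dets_reflection = {n: det(reflection(n)) for n in (2, 3, 4)}
```

The inversion has determinant $-1$ only for $n = 3$ among these cases, whereas the reflection has determinant $-1$ in every dimension.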


18.3.2 Pseudovectors 

Regardless of whether it is proper or improper, any rotation described by
$R_{ij}$ transforms the components $v_i$ of a vector $\mathbf{v}$ as

$$v'_i = R_{ij} v_j.$$

This is because any real physical vector $\mathbf{v}$ may be considered as a
geometrical object (i.e., an arrow in space), whose direction and magnitude
cannot be altered merely by describing it in terms of a different coordinate
system.

It is, however, possible to define another type of vector $\mathbf{w}$ whose
components $w_i$ transform as

$$w'_i = R_{ij} w_j \quad \text{under proper rotations},$$
$$w'_i = -R_{ij} w_j \quad \text{under improper rotations},$$

or equivalently,

$$w'_i = |R| R_{ij} w_j.$$

In this case, the $w_i$ are no longer strictly the components of a true
Cartesian vector. Rather, they are said to form the components of a
pseudovector or a first-order Cartesian pseudotensor. A pseudovector may be
alternatively referred to as an axial vector; correspondingly, a true vector
may be called a polar vector.



Fig. 18.3. Improper rotation: (b) inversion and (c) reflection of the right-handed
Cartesian coordinate system depicted in (a)

Remark. A pseudovector should not be considered a real geometric arrow
in space, since its direction is reversed by an improper transformation of
the coordinate axes. This is illustrated in Fig. 18.4, where the pseudovector
$\mathbf{w}$ is shown as a broken line to indicate that it is not a real
physical vector.


Below is a summary of the discussion above. 

Vectors and pseudovectors:

Components $v_i$ of a vector $\mathbf{v}$ transform as

$$v'_i = R_{ij} v_j$$

under a rigid rotation of Cartesian axes, whereas components $w_i$ of a
pseudovector $\mathbf{w}$ transform as

$$w'_i = |R| R_{ij} w_j,$$

where $|R|$ is the determinant of the transformation matrix $[R_{ij}]$.


Hence, the difference between a vector and a pseudovector manifests itself
only under an improper rotation, for which $|R| = -1$.

Pseudovectors occur frequently in physics, although this fact is not usually 
pointed out explicitly. Following are physical examples of pseudovectors. 

Examples The following three physical quantities are all pseudovectors.

1. Angular momentum of a moving particle, $\mathbf{L} = \mathbf{r} \times \mathbf{p}$,
where $\mathbf{r}$ is the particle's position arrow and $\mathbf{p}$ its
momentum vector.

2. Torque on a particle, $\mathbf{N} = \mathbf{r} \times \mathbf{F}$, where
$\mathbf{r}$ is the particle's position arrow and $\mathbf{F}$ the force
acting on the particle.

3. Magnetic field, $\mathbf{B} = \nabla \times \mathbf{A}$, defined by the
rotation of the vector potential $\mathbf{A}$.


[Figure 18.4: panels showing the axes $\mathbf{e}_2$, $\mathbf{e}_3$ with the
pseudovector $\mathbf{w} = w_3 \mathbf{e}_3$ before the reflection and
$\mathbf{w}' = -w_3 \mathbf{e}_3 = -\mathbf{w}$ after it]

Fig. 18.4. Reversing behavior of the pseudovector $\mathbf{w}$ via the reflection of the $\mathbf{e}_2$-axis

It is noteworthy that each of these pseudovectors consists of a vector
product of two vectors.
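The pseudovector character of a vector product can be checked numerically: for any orthogonal matrix $R$, the product of the transformed vectors equals $|R|\,R$ applied to the original product. The sketch below uses only the Python standard library; the function names and the sample vectors `b`, `c` are arbitrary illustrative choices.

```python
import math


def matvec(R, v):
    """Apply a 3x3 matrix to a 3-component list: (R v)_i = R_ij v_j."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cross(b, c):
    """Elementary vector product b x c."""
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def pseudovector_residual(R, b, c):
    """Largest component of (R b) x (R c) - |R| R (b x c)."""
    lhs = cross(matvec(R, b), matvec(R, c))
    rhs = [det3(R) * x for x in matvec(R, cross(b, c))]
    return max(abs(p - q) for p, q in zip(lhs, rhs))


theta = 0.7  # arbitrary rotation angle about the third axis
rotation = [[math.cos(theta), math.sin(theta), 0.0],
            [-math.sin(theta), math.cos(theta), 0.0],
            [0.0, 0.0, 1.0]]
reflection_e2 = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
improper = matmul(reflection_e2, rotation)  # determinant -1

b, c = [1.0, 2.0, -0.5], [0.3, -1.0, 2.0]
residual_proper = pseudovector_residual(rotation, b, c)
residual_improper = pseudovector_residual(improper, b, c)
```

Both residuals vanish to rounding error. For the pure reflection of the $\mathbf{e}_2$-axis with $\mathbf{b} = \mathbf{e}_1$ and $\mathbf{c} = \mathbf{e}_2$, the product of the transformed components is $(0, 0, -1)$ rather than $(0, 0, 1)$, which is the sign reversal sketched in Fig. 18.4.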

18.3.3 Pseudotensors 

We can extend the notion of vectors and pseudovectors to objects with two
or more subscripts. For instance, assume a quantity with components
transforming as

$$T'_{ij} = R_{ik} R_{jl} T_{kl}$$

under proper rotations, but

$$T'_{ij} = -R_{ik} R_{jl} T_{kl}$$

under improper ones. Then, the $T_{ij}$ are components of a second-order
Cartesian pseudotensor. Similarly, Cartesian pseudotensors of arbitrary
order are defined such that their components transform as

$$T'_{ij \cdots k} = |R| R_{il} R_{jm} \cdots R_{kn} T_{lm \cdots n},$$

where $|R|$ is the determinant of the transformation matrix $[R_{ij}]$.
Correspondingly, zeroth-order objects may also be divided into scalars and
pseudoscalars, the latter being invariant under proper rotations but changing
sign under improper rotations.

18.3.4 Levi-Civita Symbols

A typical example of a third-order pseudotensor is the Levi-Civita symbol
$\varepsilon_{ijk}$.

Levi-Civita symbol:

The Levi-Civita symbol (or the permutation symbol), denoted by
$\varepsilon_{ijk}$, takes the values $+1$ and $-1$ if the ordered set
$i, j, k$ is obtained by an even or odd permutation, respectively, of the
set 1, 2, 3.

Explicitly, $\varepsilon_{ijk}$ takes the values

$$\varepsilon_{123} = \varepsilon_{231} = \varepsilon_{312} = +1,$$
$$\varepsilon_{213} = \varepsilon_{321} = \varepsilon_{132} = -1,$$

and $\varepsilon_{ijk} = 0$ if any two of the indices $i, j, k$ are equal.

The pseudotensor property of $\varepsilon_{ijk}$ follows from a convenient
notation for the determinant $|A|$ of a general $3 \times 3$ matrix
$[A_{ij}]$ (see Exercise 2):

$$|A| \varepsilon_{lmn} = A_{il} A_{jm} A_{kn} \varepsilon_{ijk}.$$

Certainly, this equation holds for the transformation matrix $[R_{ij}]$ for
rigid rotation. Hence, we have

$$|R| \varepsilon_{lmn} = R_{il} R_{jm} R_{kn} \varepsilon_{ijk},$$

or equivalently,

$$\varepsilon_{ijk} = |R| R_{il} R_{jm} R_{kn} \varepsilon_{lmn}. \tag{18.25}$$

Since the symbol takes the same values in every coordinate system, this
shows that $\varepsilon_{ijk}$ is a third-order Cartesian pseudotensor.
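The determinant identity on which (18.25) rests can be verified by brute force over all index values. The following is a sketch with zero-based indices; the closed-form expression used in `levi_civita` and the sample matrix `A` are illustrative assumptions, not taken from the text.

```python
from itertools import product


def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2} (zero-based)."""
    # The product is +2 or -2 for permutations and 0 for repeated indices.
    return (i - j) * (j - k) * (k - i) // 2


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def determinant_identity_holds(A):
    """Check |A| eps_lmn == A_il A_jm A_kn eps_ijk for every l, m, n."""
    idx = range(3)
    d = det3(A)
    for l, m, n in product(idx, repeat=3):
        rhs = sum(A[i][l] * A[j][m] * A[k][n] * levi_civita(i, j, k)
                  for i, j, k in product(idx, repeat=3))
        if d * levi_civita(l, m, n) != rhs:
            return False
    return True


A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]  # arbitrary integer matrix
identity_ok = determinant_identity_holds(A)
```

Integer entries keep the comparison exact; for a floating-point matrix the equality test would need a tolerance.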

The result (18.25) indicates more than the pseudotensorial character
of $\varepsilon_{ijk}$. It clearly demonstrates that all of the components of
$\varepsilon_{ijk}$ are unaltered by any proper rotation of axes. Tensors
endowed with this property are called isotropic tensors (invariant tensors or
fundamental tensors). We know that there are no nonzero isotropic tensors of
first order and that the only ones of second and third order are scalar
multiples of $\delta_{ij}$ and $\varepsilon_{ijk}$, respectively.
Additionally, the most general isotropic tensor of fourth order is given by

$$\lambda \delta_{ik} \delta_{mp} + \mu \delta_{im} \delta_{kp} + \nu \delta_{ip} \delta_{km},$$

with arbitrary constants $\lambda, \mu, \nu$. (Such a fourth-order isotropic
tensor occurs in the elasticity theory of solids; see Sect. 18.5.4.) All the
isotropic tensors above are relevant to the description of the physical
properties of an isotropic medium (i.e., a medium having the same properties
regardless of the way in which it is oriented).


Exercises 

1. Show that an angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is a pseudovector.

Solution: Since the position vector $\mathbf{r}$ and the momentum vector
$\mathbf{p}$ are vectors, they transform under any rotation of the axes
(proper or improper) as $x'_j = R_{jk} x_k$, $p'_m = R_{mn} p_n$. Hence, the
components of $\mathbf{L}$ in a new coordinate system read

$$\begin{aligned}
L'_i &= \varepsilon_{ijk} x'_j p'_k \\
&= (|R| R_{il} R_{jm} R_{kn} \varepsilon_{lmn})(R_{jq} x_q)(R_{ks} p_s) \\
&= |R| R_{il} (R_{jm} R_{jq})(R_{kn} R_{ks}) \varepsilon_{lmn} x_q p_s \\
&= |R| R_{il} \delta_{mq} \delta_{ns} \varepsilon_{lmn} x_q p_s \\
&= |R| R_{il} \varepsilon_{lmn} x_m p_n = |R| R_{il} L_l,
\end{aligned}$$

which clearly indicates that the quantities $L_i$ form the components
of a first-order Cartesian pseudotensor (i.e., a pseudovector).

2. Determine whether $|A| \varepsilon_{lmn} = A_{il} A_{jm} A_{kn} \varepsilon_{ijk}$
holds for a general matrix $[A_{ij}]$ in three dimensions.

Solution: Set $l = 1$, $m = 2$, $n = 3$, for instance, to find that the
right-hand side reads

$$\begin{aligned}
A_{i1} A_{j2} A_{k3} \varepsilon_{ijk} &= A_{11} A_{22} A_{33} + A_{21} A_{32} A_{13} + A_{31} A_{12} A_{23} \\
&\quad - A_{11} A_{32} A_{23} - A_{21} A_{12} A_{33} - A_{31} A_{22} A_{13} = |A|.
\end{aligned}$$

Other cases can be proven in the same manner.

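3. Derive the identity $\varepsilon_{ijk} \varepsilon_{klm} = \delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}$.

Solution: We first note that the right-hand side of the above identity,
$\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}$, reads

$$+1 \quad \text{if } i = l \text{ and } j = m \neq i, \tag{18.26}$$

$$-1 \quad \text{if } i = m \text{ and } j = l \neq i, \tag{18.27}$$

$$0 \quad \text{otherwise}.$$

In the case of (18.26), the left-hand side of the desired identity is

$$\varepsilon_{ijk} \varepsilon_{klm} = \varepsilon_{ijk} \varepsilon_{kij} = (\varepsilon_{ijk})^2. \tag{18.28}$$

Since $i \neq j$, the term in (18.28) takes the value $+1$ when $k \neq i$
and $k \neq j$ and vanishes otherwise. Exactly one value of $k$ satisfies
this condition, so the sum over $k$ equals $+1$.

As a result, we obtain the desired identity in this case. A similar
procedure reveals that $\varepsilon_{ijk} \varepsilon_{klm} = -1$ in the case
of (18.27) and 0 otherwise.

Remark. We should note that in (18.28), we have not summed with respect
to $i$ and $j$. This is because the second term in (18.28) was obtained by a
substitution of particular values into the subscripts $l$ and $m$, respectively.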
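Because each index takes only three values, the identity of Exercise 3 can also be confirmed exhaustively over all 81 combinations of the free indices. The sketch below uses zero-based indices and illustrative helper names.

```python
from itertools import product


def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2} (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def contracted_epsilon(i, j, l, m):
    """Left-hand side eps_ijk eps_klm, summed over the repeated index k."""
    return sum(levi_civita(i, j, k) * levi_civita(k, l, m) for k in range(3))


# Compare both sides for every choice of the free indices i, j, l, m.
epsilon_delta_holds = all(
    contracted_epsilon(i, j, l, m)
    == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
    for i, j, l, m in product(range(3), repeat=4)
)
```

Only $k$ is summed in `contracted_epsilon`, which mirrors the remark that no summation over $i$ and $j$ is implied in (18.28).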


18.4 Tensor Algebra 

18.4.1 Addition and Subtraction 

We demonstrate below the basics of tensor algebra, which provide ways of
constructing new tensors from old ones. For convenience, we may simply refer
to $T_{ij}$ as the tensor, but it should always be remembered that the
$T_{ij}$ are the components of $\mathsf{T}$ in a specific coordinate system.

The addition and subtraction of tensors are defined in an obvious fashion.
If $A_{ij \cdots k}$ and $B_{ij \cdots k}$ are (the components of) tensors of
the same order, then their sum and difference, $S_{ij \cdots k}$ and
$D_{ij \cdots k}$ respectively, are given by

$$S_{ij \cdots k} = A_{ij \cdots k} + B_{ij \cdots k},$$
$$D_{ij \cdots k} = A_{ij \cdots k} - B_{ij \cdots k},$$

for each set of values $i, j, \ldots, k$. Furthermore, the linearity of a
rotation of coordinates immediately yields

$$\begin{aligned}
R_{ip} R_{jq} \cdots R_{kr} S_{pq \cdots r} &= R_{ip} R_{jq} \cdots R_{kr} (A_{pq \cdots r} + B_{pq \cdots r}) \\
&= R_{ip} R_{jq} \cdots R_{kr} A_{pq \cdots r} + R_{ip} R_{jq} \cdots R_{kr} B_{pq \cdots r} \\
&= A'_{ij \cdots k} + B'_{ij \cdots k} = S'_{ij \cdots k},
\end{aligned}$$

so that the sum (and likewise the difference) is again a tensor of the same
order.

18.4.2 Contraction 

Next is an operation peculiar to tensor algebra that is of considerable impor- 
tance in certain manipulations. 

Contraction: 

Contraction is an operation that makes two of the indices equal and 
sums over all values of the equalized indices. 


As an example, we consider a third-order tensor $T_{ijk}$ whose transformation
law is described by

$$T'_{ijk} = R_{il} R_{jm} R_{kn} T_{lmn}. \tag{18.29}$$

Now we perform a contraction of this tensor with respect to $j$ and $k$.
Setting $j = k$ in (18.29) and summing over $k$, we get

$$T'_{ikk} = R_{il} R_{km} R_{kn} T_{lmn} = R_{il} \delta_{mn} T_{lmn} = R_{il} T_{lmm},$$

where we used the orthogonality condition on the sum $R_{km} R_{kn}$. The
result indicates that the quantity $T_{ikk}$ forms the components of a tensor
of order $1 = 3 - 2$. In general, contraction reduces the order of a tensor
by two; contraction of an $N$th-order tensor
$T_{ij \cdots l \cdots m \cdots k}$ by making the subscripts $l$ and $m$ equal
produces another tensor of order $N - 2$. In particular, if contraction
is applied to a tensor of order 2, the result is a scalar.
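The statement that $T_{ikk}$ transforms as a vector can be illustrated numerically by rotating a third-order array and contracting before and after. The following is a sketch only; the rotation angle and the components of `T` are arbitrary sample data.

```python
import math
from itertools import product


def rotate_rank3(R, T):
    """Apply T'_ijk = R_il R_jm R_kn T_lmn, cf. (18.29)."""
    idx = range(3)
    return [[[sum(R[i][l] * R[j][m] * R[k][n] * T[l][m][n]
                  for l, m, n in product(idx, repeat=3))
              for k in idx] for j in idx] for i in idx]


def contract_last_pair(T):
    """Contract the second and third indices: returns T_ikk summed over k."""
    return [sum(T[i][k][k] for k in range(3)) for i in range(3)]


def rotate_vector(R, v):
    """Apply v'_i = R_il v_l."""
    return [sum(R[i][l] * v[l] for l in range(3)) for i in range(3)]


theta = 0.4  # arbitrary rotation angle about the third axis
R = [[math.cos(theta), math.sin(theta), 0.0],
     [-math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]

# Arbitrary third-order components T_lmn (zero-based indices).
T = [[[(l + 1) * (m - 2) + n * n + 0.5 * l * n for n in range(3)]
      for m in range(3)] for l in range(3)]

contract_then_rotate = rotate_vector(R, contract_last_pair(T))
rotate_then_contract = contract_last_pair(rotate_rank3(R, T))
max_difference = max(abs(p - q)
                     for p, q in zip(contract_then_rotate, rotate_then_contract))
```

The two orders of operation agree to rounding error, which is the content of $T'_{ikk} = R_{il} T_{lmm}$.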


18.4.3 Outer and Inner Products 

Let us consider the multiplication of tensors. For example, we may take two
tensors $A_{ij}$ and $B_{klm}$ of different order and simply write them in
juxtaposition:

$$T_{ijklm} = A_{ij} B_{klm}. \tag{18.30}$$

Then, the quantities $T_{ijklm}$ are the components of a tensor of fifth
order, which follows immediately from the transformation law of tensors. Such
a product as (18.30), in which all the indices are different from one another,
is called an outer product of tensors.

Another kind of product of tensors, known as the inner product of tensors,
is obtained from the outer product by contraction. For instance, putting
$j = k$ in (18.30) results in

$$T_{ijjlm} = A_{ij} B_{jlm}, \tag{18.31}$$

which constitutes a third-order tensor as demonstrated in Sect. 18.4.2. Then,
the right-hand side of (18.31) is called an inner product of the components of
the tensors $A_{ij}$ and $B_{klm}$.

Examples The process of taking the scalar product of two vectors $\mathbf{u}$
and $\mathbf{v}$, expressed by $u_i v_i$, can be recast into tensor language
as forming the outer product

$$T_{ij} = u_i v_j$$

and then contracting it to give

$$T_{ii} = u_i v_i.$$

Using the concept of outer (and inner) product of tensors, we can write
many familiar expressions of vector algebra as contracted tensors. For
example, the vector product $\mathbf{a} = \mathbf{b} \times \mathbf{c}$ has

$$a_i = \varepsilon_{ijk} b_j c_k$$

as its $i$th component, where $\varepsilon_{ijk}$ is the Levi-Civita symbol
introduced in Sect. 18.3.4. This notation clarifies the distinction between
the pseudovector consisting of the components $\varepsilon_{ijk} b_j c_k$ and
the second-order tensor composed of the outer product $b_i c_j$.

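Remark. The outer product of two vectors is often denoted without reference
to any coordinate system as

$$\mathsf{T} = \mathbf{u} \otimes \mathbf{v}. \tag{18.32}$$

This should not be confused with the vector product of two vectors, which is
itself a pseudovector and is discussed in Sect. 18.3.2. The expression (18.32)
gives the basis to which the components $T_{ij}$ of the second-order tensor
refer: since $\mathbf{u} = u_i \mathbf{e}_i$ and $\mathbf{v} = v_j \mathbf{e}_j$,
we may write the tensor $\mathsf{T}$ as

$$\mathsf{T} = u_i \mathbf{e}_i \otimes v_j \mathbf{e}_j = u_i v_j \, \mathbf{e}_i \otimes \mathbf{e}_j = T_{ij} \, \mathbf{e}_i \otimes \mathbf{e}_j.$$

Furthermore, we have

$$\mathsf{T} = u_i \mathbf{e}_i \otimes v_j \mathbf{e}_j = u'_i \mathbf{e}'_i \otimes v'_j \mathbf{e}'_j,$$

which indicates that the quantities $T'_{ij} = u'_i v'_j$ are the components
of the same tensor $\mathsf{T}$ but referred to a different coordinate system.
These concepts can be extended to higher-order tensors.

We show below several expressions of vector algebra as contracted Cartesian
tensors: the notation $[\mathbf{a}]_i$ indicates that one takes the $i$th
component of the vector (or tensor) $\mathbf{a}$.

Examples

1. $\mathbf{a} \cdot \mathbf{b} = a_i b_i = \delta_{ij} a_i b_j$.

2. $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \delta_{ij} a_i [\mathbf{b} \times \mathbf{c}]_j = a_i (\varepsilon_{ijk} b_j c_k) = \varepsilon_{ijk} a_i b_j c_k$.

3. $\nabla^2 \phi = \dfrac{\partial^2 \phi}{\partial x_i \partial x_i} = \delta_{ij} \dfrac{\partial^2 \phi}{\partial x_i \partial x_j}$.

4. $[\nabla \times \mathbf{v}]_i = \varepsilon_{ijk} \dfrac{\partial v_k}{\partial x_j}$.

5. $[\nabla (\nabla \cdot \mathbf{v})]_i = \dfrac{\partial}{\partial x_i} (\nabla \cdot \mathbf{v}) = \dfrac{\partial^2 v_j}{\partial x_i \partial x_j}$.

6. $[\nabla \times (\nabla \times \mathbf{v})]_i = \varepsilon_{ijk} \dfrac{\partial}{\partial x_j} [\nabla \times \mathbf{v}]_k = \varepsilon_{ijk} \varepsilon_{klm} \dfrac{\partial^2 v_m}{\partial x_j \partial x_l}$.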

18.4.4 Symmetric and Antisymmetric Tensors 

The order of subscripts attached to a tensor is important; in general,
$T_{ij}$ is not the same as $T_{ji}$. But there are some cases of interest as
described below.

Symmetric and antisymmetric tensors:

If

$$T_{ij} = T_{ji}$$

holds for all $i$ and $j$, the tensor composed of $T_{ij}$ is called a
symmetric tensor. If, instead,

$$T_{ij} = -T_{ji}, \tag{18.33}$$

the tensor is said to be antisymmetric (or skew-symmetric).


A tensor that is symmetric (or antisymmetric) in one coordinate system
remains symmetric (or antisymmetric) in any other coordinate system. In fact,
if $T_{ij}$ is symmetric in a given system, i.e., $T_{ij} = T_{ji}$, then

$$T'_{ij} = R_{ik} R_{jl} T_{kl} = R_{jl} R_{ik} T_{lk} = T'_{ji},$$

and similarly for antisymmetry and tensors of higher order.

Notably, every second-order tensor can be resolved into symmetric and
antisymmetric parts by the identity

$$T_{ij} = S_{ij} + A_{ij}, \tag{18.34}$$

where

$$S_{ij} = \tfrac{1}{2} (T_{ij} + T_{ji}) \quad \text{and} \quad A_{ij} = \tfrac{1}{2} (T_{ij} - T_{ji}).$$

Evidently $S_{ij}$ is a symmetric tensor since it is unaltered even if $i$
and $j$ are interchanged. In contrast, $A_{ij}$ is an antisymmetric tensor
since the signs of all the components are reversed by exchanging $i$ and $j$.
Then, $S_{ij}$ and $A_{ij}$ are called the symmetric and antisymmetric parts
of $T_{ij}$, respectively.
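The decomposition (18.34) takes only a few lines to carry out explicitly. The following is a minimal sketch with an arbitrary sample matrix; the function names are illustrative.

```python
def symmetric_part(T):
    """S_ij = (T_ij + T_ji) / 2."""
    n = len(T)
    return [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]


def antisymmetric_part(T):
    """A_ij = (T_ij - T_ji) / 2."""
    n = len(T)
    return [[0.5 * (T[i][j] - T[j][i]) for j in range(n)] for i in range(n)]


T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]  # arbitrary second-order components

S = symmetric_part(T)
A = antisymmetric_part(T)

# Recombine the two parts; this should reproduce T component by component.
recombined = [[S[i][j] + A[i][j] for j in range(3)] for i in range(3)]
```

In three dimensions `S` carries six independent components and `A` three, which together account for the nine components of `T`.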


18.4.5 Equivalence of an Antisymmetric Second-Order Tensor 
to a Pseudovector 

It is noteworthy that in three dimensions, a second-order antisymmetric
tensor $\mathsf{W}$ is associated with a pseudovector $\mathbf{w}$. To see
this, let the $W_{ij}$ be components of an antisymmetric second-order tensor
whose transformation law reads

$$\begin{aligned}
W'_{ij} &= R_{il} R_{jm} W_{lm} \\
&= R_{i1} R_{j2} W_{12} + R_{i1} R_{j3} W_{13} + R_{i2} R_{j1} W_{21} + R_{i2} R_{j3} W_{23} \\
&\quad + R_{i3} R_{j1} W_{31} + R_{i3} R_{j2} W_{32},
\end{aligned} \tag{18.35}$$

since $W_{11} = W_{22} = W_{33} = 0$. Moreover, since $W_{lm} = -W_{ml}$, we
can reduce (18.35) to the form

$$W'_{ij} = \sum_{(l,m)} (R_{il} R_{jm} - R_{im} R_{jl}) W_{lm}, \tag{18.36}$$

where the sum restricts the values of $(l, m)$ to (1,2), (2,3), or (3,1).

Now we introduce the notation

$$w_1 = W_{23} = -W_{32}, \quad w_2 = W_{31} = -W_{13}, \quad w_3 = W_{12} = -W_{21},$$

or more concisely,

$$w_n = W_{lm},$$

where $l, m, n$ is a cyclic permutation of the numbers 1, 2, 3, i.e.,
$(l, m, n) = (1, 2, 3), (2, 3, 1), (3, 1, 2)$.

Then (18.36) can be written as

$$w'_k = \sum_{(l,m)} (R_{il} R_{jm} - R_{im} R_{jl}) w_n, \tag{18.37}$$

in which $i, j, k$ and $l, m, n$ are both cyclic permutations of 1, 2, 3.

Noteworthy is the fact that (18.37) is equivalent to the transformation
law of components $w_k$ of a pseudovector $\mathbf{w}$. After some algebra, we
see that equation (18.37) can be reduced to a more compact form as

$$w'_k = |R| R_{kn} w_n, \tag{18.38}$$

which is nothing but the transformation law of a pseudovector. [See Exercise
2 for the proof of (18.38).]

We have now arrived at the following theorem: 


Theorem:

Assume a second-order antisymmetric tensor in three dimensions, whose
components $W_{ij}$ take the form

$$[W_{ij}] = \begin{pmatrix} 0 & W_{12} & -W_{31} \\ -W_{12} & 0 & W_{23} \\ W_{31} & -W_{23} & 0 \end{pmatrix}.$$

Then, the three components $W_{12}$, $W_{31}$, and $W_{23}$ can be associated
with the pseudovector $\mathbf{w}$ whose components are given by

$$(w_1, w_2, w_3) = (W_{23}, W_{31}, W_{12}),$$

or more concisely,

$$w_i = \tfrac{1}{2} \varepsilon_{ijk} W_{jk}. \tag{18.39}$$


The right-hand side of (18.39) is a twice-contracted product of the third-order
pseudotensor $\varepsilon_{ijk}$ and the second-order tensor $W_{jk}$; hence,
it is a pseudovector.
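The correspondence between an antisymmetric tensor and its pseudovector can be tested numerically: transforming $W_{ij}$ as a tensor and then extracting the three components should agree with applying the pseudovector law directly. The sketch below assumes the factor $\tfrac{1}{2}$ in the contraction with the Levi-Civita symbol, which is what reproduces $(w_1, w_2, w_3) = (W_{23}, W_{31}, W_{12})$; all names and sample values are illustrative.

```python
import math
from itertools import product


def levi_civita(i, j, k):
    """Permutation symbol for zero-based indices."""
    return (i - j) * (j - k) * (k - i) // 2


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def transform_rank2(R, W):
    """W'_ij = R_il R_jm W_lm."""
    idx = range(3)
    return [[sum(R[i][l] * R[j][m] * W[l][m] for l, m in product(idx, repeat=2))
             for j in idx] for i in idx]


def dual_pseudovector(W):
    """w_i = (1/2) eps_ijk W_jk."""
    idx = range(3)
    return [0.5 * sum(levi_civita(i, j, k) * W[j][k]
                      for j, k in product(idx, repeat=2)) for i in idx]


def law_residual(R, W):
    """Largest deviation between dual(W') and |R| R dual(W)."""
    w = dual_pseudovector(W)
    expected = [det3(R) * sum(R[k][n] * w[n] for n in range(3)) for k in range(3)]
    actual = dual_pseudovector(transform_rank2(R, W))
    return max(abs(p - q) for p, q in zip(actual, expected))


theta = 1.1  # arbitrary rotation angle about the third axis
proper = [[math.cos(theta), math.sin(theta), 0.0],
          [-math.sin(theta), math.cos(theta), 0.0],
          [0.0, 0.0, 1.0]]
improper = [[-x for x in row] for row in proper]  # rotation followed by inversion

# Antisymmetric tensor built from (w_1, w_2, w_3) = (1, 2, 3).
W = [[0.0, 3.0, -2.0],
     [-3.0, 0.0, 1.0],
     [2.0, -1.0, 0.0]]

w = dual_pseudovector(W)
residual_proper = law_residual(proper, W)
residual_improper = law_residual(improper, W)
```

Under the inversion-like matrix `improper`, the tensor components are unchanged in sign (two factors of $R$), while a true vector would flip; the factor $|R| = -1$ in the pseudovector law compensates for exactly this.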

Examples In physical applications, we often use the vector representation
(18.39) of a second-order antisymmetric tensor. For instance, let us consider
the equations of angular momentum of a moving particle with mass $m$. We
assume that a force $\mathbf{F}$ acts on the particle located at $\mathbf{x}$.
Then, with $j$ and $k$ each taking the values 1, 2, 3, we get

$$m (\ddot{x}_j x_k - \ddot{x}_k x_j) = F_j x_k - F_k x_j, \tag{18.40}$$

which gives us nine equations. Note that both sides of (18.40) are
antisymmetric tensors. Among the nine equations, therefore, there are only
three that are independent, $(j, k) = (1, 2), (2, 3), (3, 1)$. So we can
convert (18.40) into a more concise vector form as

$$m \omega_i = N_i,$$

where we have defined

$$\omega_i = \varepsilon_{ijk} (\ddot{x}_j x_k - \ddot{x}_k x_j) \quad \text{and} \quad N_i = \varepsilon_{ijk} (F_j x_k - F_k x_j).$$

18.4.6 Quotient Theorem

Sometimes it is necessary to clarify whether a set of functions, say, {ai(xj)}, 
forms the components of a vector or not. A direct method is to examine 
whether the functions satisfy a required transformation law under a rotation 
of axes, which is, however, troublesome in practice. In this subsection, we 
describe an alternative and more efficient method, called the quotient law, 
which is a simple indirect test for determining whether a given set of quantities 
forms the components of a tensor. 

Quotient theorem:

If $a_i v_i$ is a scalar for an arbitrary vector $\mathbf{v}$ in any rotated
coordinate system, then the $a_i$ constitute the components of a vector
$\mathbf{a}$.


Proof Suppose that we are given a set of $n$ quantities $a_i$ subject to the
condition that $a_i v_i$ is a scalar for components $v_i$ of an arbitrary
vector $\mathbf{v}$ in terms of an arbitrarily rotated coordinate system. We
may then write

$$a_j v_j = \phi, \tag{18.41}$$

in which $\phi$ denotes a scalar. Denoting the (as yet unknown) transform of
$a_i$ by $a'_i$, we know that in the $x'$-coordinate system the condition
(18.41) reads

$$a'_i v'_i = \phi'. \tag{18.42}$$

Since $\phi$ is a scalar, $\phi = \phi'$. Furthermore, since $v_i$ are
components of a vector, it follows that

$$v'_i = R_{ij} v_j.$$

Accordingly, subtracting (18.42) from (18.41) gives

$$(a_j - a'_i R_{ij}) v_j = 0. \tag{18.43}$$

On the left-hand side, a summation over $j$ is implied, so we cannot assert
directly that the coefficients of $v_j$ vanish. However, since (18.43) should
be valid for an arbitrary vector $\mathbf{v}$, we may specifically choose the
vector whose components read $v_1 = 1$ and $v_i = 0$ for $i \neq 1$. Equation
(18.43) then reduces to

$$a_1 - R_{i1} a'_i = 0.$$

Similarly, choosing a vector with the components $v_2 = 1$ and $v_i = 0$ for
$i \neq 2$, we infer that

$$a_2 - R_{i2} a'_i = 0.$$

Continuing in this manner, we find that

$$a_j = R_{ij} a'_i \quad \text{for all } j.$$

Multiplying both sides by $R_{kj}$ yields

$$R_{kj} a_j = R_{kj} R_{ij} a'_i = \delta_{ki} a'_i = a'_k,$$

which is the transformation law for components of a vector. We thus conclude
that the $a_i$ constitute the components of a vector, denoted by $\mathbf{a}$.

Remark. In applications of the above theorem, one must be certain that the 
coordinate system employed is arbitrarily rotated, and this hypothesis repre- 
sents a very strict condition that is not often satisfied. 


18.4.7 Quotient Theorem for Two-Subscripted Quantities 

As a second important case, assume a set of $n^2$ quantities $a_{ij}$ such
that $a_{ij} v_i v_j$ is a scalar $\phi$ for an arbitrary vector $\mathbf{v}$
and for any rotated coordinate system. Our task is to examine whether such
two-subscripted quantities $a_{ij}$ constitute the components of a tensor of
second order. We shall see, however, that the answer is negative. In fact, we
can say nothing about the tensorial character of $a_{ij}$ itself from the
hypothesis noted above, which implies the need to modify the quotient theorem
for two-subscripted quantities.

Developing the modified quotient theorem requires a discussion that
parallels that given in Sect. 18.4.6. By hypothesis, we can set

$$a_{ij} v_i v_j = \phi$$

in the given $x$-coordinate system and similarly

$$a'_{kl} v'_k v'_l = \phi' \tag{18.44}$$

in the $x'$-coordinate system. In (18.44), we have denoted the as yet unknown
transforms of $a_{ij}$ by $a'_{ij}$. Using the transformation law of $v_i$ as
well as the fact that $\phi = \phi'$ gives us

$$(a_{ij} - R_{ki} R_{lj} a'_{kl}) v_i v_j = 0. \tag{18.45}$$

As a summation is implied over $i$ and $j$, we cannot infer directly that
the coefficients of $v_i v_j$ vanish. Instead, we successively choose
components $(v_1, v_2, v_3, \cdots)$ as $(1, 0, 0, \cdots)$ and
$(0, 1, 0, \cdots)$, etc., to get

$$a_{11} - R_{k1} R_{l1} a'_{kl} = 0, \quad a_{22} - R_{k2} R_{l2} a'_{kl} = 0, \quad \cdots. \tag{18.46}$$

These results imply that the terms $a_{ij}$ with $i = j$ obey the
transformation law of second-order tensors. Nevertheless, they tell us nothing
about the terms involving $a_{ij}$ with $i \neq j$. To further examine this
point, we set components as $v_1 \neq 0$, $v_2 \neq 0$, and $v_i = 0$ for
other $i$. Then, (18.45) becomes

$$\begin{aligned}
&(a_{11} - R_{k1} R_{l1} a'_{kl}) v_1 v_1 + (a_{12} - R_{k1} R_{l2} a'_{kl}) v_1 v_2 \\
&\quad + (a_{21} - R_{k2} R_{l1} a'_{kl}) v_2 v_1 + (a_{22} - R_{k2} R_{l2} a'_{kl}) v_2 v_2 = 0.
\end{aligned}$$

Owing to (18.46), we find that the coefficients of $v_1 v_1$ and $v_2 v_2$
vanish. Furthermore, since

$$R_{k2} R_{l1} a'_{kl} = R_{k1} R_{l2} a'_{lk}$$

is simply a relabeling of the indices $k$ and $l$, we see that

$$[(a_{12} + a_{21}) - (a'_{kl} + a'_{lk}) R_{k1} R_{l2}] v_1 v_2 = 0.$$

Thus, choosing $v_1 = 1$ and $v_2 = 1$ gives us

$$a_{12} + a_{21} = (a'_{kl} + a'_{lk}) R_{k1} R_{l2}.$$

Again, this process may be repeated to yield

$$a_{ij} + a_{ji} = (a'_{kl} + a'_{lk}) R_{ki} R_{lj},$$

i.e.,

$$a'_{kl} + a'_{lk} = R_{ki} R_{lj} (a_{ij} + a_{ji}).$$

This is indeed the transformation law of a second-order tensor, but it refers
to $a_{ij} + a_{ji}$, i.e., twice the symmetric part of $a_{ij}$, and not to
$a_{ij}$ as such. Accordingly, the quotient theorem for this case must be
stated as follows.

Quotient theorem for two-subscripted quantities:

Suppose a set of $n^2$ quantities $a_{ij}$ to be such that for an arbitrary
vector $\mathbf{v}$ and for any rotated system, the sum $a_{ij} v_i v_j$ is a
scalar. Then the symmetric parts $(a_{ij} + a_{ji})/2$ of $a_{ij}$ are the
components of a second-order tensor.


Remark.

1. If in addition to the above hypothesis, we are given that the $a_{ij}$ are
symmetric, then the $a_{ij}$ themselves are the components of a second-order
tensor.

2. Nothing can be inferred about the tensorial character of the antisymmetric
part of $a_{ij}$ from the above hypothesis, because that part contributes
nothing to the scalar $\phi$, as seen from

$$(a_{ij} - a_{ji}) v_i v_j = a_{ij} v_i v_j - a_{ji} v_i v_j = a_{ij} v_i v_j - a_{ij} v_j v_i = 0,$$

where in the last step the indices $i$ and $j$ are interchanged.

Example Using the quotient theorem, we show that the two-subscripted
quantities $a_{ij}$ given by

$$[a_{ij}] = \begin{pmatrix} (x_2)^2 & -x_1 x_2 \\ -x_1 x_2 & (x_1)^2 \end{pmatrix}$$

are the components of a second-order tensor. Note first that
$a_{ij} = a_{ji}$ and that the outer product $x_k x_l$ is a second-order
tensor. Contracting the quantities $a_{ij}$ with the outer product
$x_k x_l$, we obtain

$$a_{ij} x_i x_j = (x_2)^2 (x_1)^2 - x_1 x_2 x_1 x_2 - x_1 x_2 x_2 x_1 + (x_1)^2 (x_2)^2 = 0, \tag{18.47}$$

in which the last term, 0, is a zeroth-order tensor. Since (18.47) holds for
any rotated coordinate system, we conclude that $a_{ij}$ is a second-order
tensor.
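Item 2 of the remark above, namely that the antisymmetric part of $a_{ij}$ drops out of $a_{ij} v_i v_j$, can be seen concretely with a small computation. The following is a sketch with arbitrary sample numbers; the array `a` is deliberately chosen to be neither symmetric nor antisymmetric.

```python
def quadratic_form(a, v):
    """Evaluate a_ij v_i v_j with summation over both indices."""
    n = len(v)
    return sum(a[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


a = [[1, 2, 0],
     [-4, 3, 5],
     [6, 1, -2]]   # arbitrary two-subscripted quantities
v = [1, -2, 3]     # arbitrary vector components

# Twice the symmetric and antisymmetric parts (kept as integers for exactness).
twice_symmetric = [[a[i][j] + a[j][i] for j in range(3)] for i in range(3)]
twice_antisymmetric = [[a[i][j] - a[j][i] for j in range(3)] for i in range(3)]

full_value = quadratic_form(a, v)
symmetric_value = quadratic_form(twice_symmetric, v)          # equals 2 * full_value
antisymmetric_value = quadratic_form(twice_antisymmetric, v)  # equals 0
```

Since the scalar is blind to the antisymmetric part, no hypothesis on $a_{ij} v_i v_j$ alone can constrain how that part transforms.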


Exercises 


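1. Derive the equation $\nabla \times (\nabla \times \mathbf{v}) = \nabla (\nabla \cdot \mathbf{v}) - \nabla^2 \mathbf{v}$.

Solution: Straightforward calculations yield

$$\begin{aligned}
[\nabla \times (\nabla \times \mathbf{v})]_i &= \varepsilon_{ijk} \varepsilon_{klm} \frac{\partial^2 v_m}{\partial x_j \partial x_l} = (\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}) \frac{\partial^2 v_m}{\partial x_j \partial x_l} \\
&= \frac{\partial^2 v_j}{\partial x_j \partial x_i} - \frac{\partial^2 v_i}{\partial x_j \partial x_j} = \frac{\partial}{\partial x_i} \left( \frac{\partial v_j}{\partial x_j} \right) - \frac{\partial^2 v_i}{\partial x_j \partial x_j} \\
&= \frac{\partial}{\partial x_i} (\nabla \cdot \mathbf{v}) - \nabla^2 v_i = [\nabla (\nabla \cdot \mathbf{v})]_i - [\nabla^2 \mathbf{v}]_i \\
&= [\nabla (\nabla \cdot \mathbf{v}) - \nabla^2 \mathbf{v}]_i.
\end{aligned}$$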

2. Derive the expression (18.38) using the result (18.37).

Solution: We consider the vector products (in the sense of elementary
vector calculus) of the transformed basis arrows $\mathbf{e}'_i$ given by

$$\mathbf{e}'_i \times \mathbf{e}'_j = (R_{il} \mathbf{e}_l) \times (R_{jm} \mathbf{e}_m) = R_{il} R_{jm} \, \mathbf{e}_l \times \mathbf{e}_m.$$

Forming the scalar product with $\mathbf{e}_n$ yields

$$(\mathbf{e}'_i \times \mathbf{e}'_j) \cdot \mathbf{e}_n = R_{il} R_{jm} (\mathbf{e}_l \times \mathbf{e}_m) \cdot \mathbf{e}_n,$$

where on the right-hand side only two terms survive for each fixed
value of $n$ since

$$(\mathbf{e}_l \times \mathbf{e}_m) \cdot \mathbf{e}_n = \begin{cases} +1 & \text{if } (l, m, n) = (1, 2, 3), (2, 3, 1), (3, 1, 2), \\ -1 & \text{if } (l, m, n) = (2, 1, 3), (3, 2, 1), (1, 3, 2), \\ 0 & \text{otherwise}. \end{cases}$$

(Here we assume that the coordinate systems associated with
$\{\mathbf{e}_i\}$ and $\{\mathbf{e}'_i\}$ are both right-handed.) Hence, we
have

$$(\mathbf{e}'_i \times \mathbf{e}'_j) \cdot \mathbf{e}_n = R_{il} R_{jm} - R_{im} R_{jl}, \tag{18.48}$$

where $l, m, n$ is a cyclic permutation of 1, 2, 3. Moreover, since

$$\mathbf{e}'_i \times \mathbf{e}'_j = \mathbf{e}'_k = R_{km} \mathbf{e}_m, \tag{18.49}$$

it follows from (18.48) and (18.49) that

$$R_{km} \mathbf{e}_m \cdot \mathbf{e}_n = R_{km} \delta_{mn} = R_{kn} = R_{il} R_{jm} - R_{im} R_{jl},$$

where again $i, j, k$ and $l, m, n$ are both cyclic permutations of
1, 2, 3. If $\{\mathbf{e}'_i\}$ is left-handed, a similar procedure yields

$$R_{kn} = -(R_{il} R_{jm} - R_{im} R_{jl}).$$

Substituting these results into (18.37), we finally arrive at the
conclusion that

$$w'_k = |R| R_{kn} w_n,$$

which is a transformation law for a pseudovector.

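3. Show that the process of contraction of an $N$th-order tensor produces
another tensor of order $N - 2$.

Solution: Let $T_{ij \cdots l \cdots m \cdots k}$ be the components of an
$N$th-order tensor; then

$$T'_{ij \cdots l \cdots m \cdots k} = R_{ip} R_{jq} \cdots R_{lr} \cdots R_{ms} \cdots R_{kn} T_{pq \cdots r \cdots s \cdots n}.$$

Thus if, e.g., we make the two subscripts $l$ and $m$ equal and sum
over all the values of these subscripts, we obtain

$$\begin{aligned}
T'_{ij \cdots l \cdots l \cdots k} &= R_{ip} R_{jq} \cdots R_{lr} \cdots R_{ls} \cdots R_{kn} T_{pq \cdots r \cdots s \cdots n} \\
&= R_{ip} R_{jq} \cdots \delta_{rs} \cdots R_{kn} T_{pq \cdots r \cdots s \cdots n} \\
&= R_{ip} R_{jq} \cdots R_{kn} T_{pq \cdots r \cdots r \cdots n},
\end{aligned}$$

showing that $T_{ij \cdots l \cdots l \cdots k}$ are the components of a
(different) Cartesian tensor of order $N - 2$.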


18.5 Applications in Physics and Engineering 

This section is devoted to illustrations of physical applications of second- and 
higher-order Cartesian tensors. We start with an example from mechanics and 
follow that by examples from electromagnetism and elasticity. 

18.5.1 Inertia Tensor 

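Consider a collection of rigidly connected particles, wherein the $\alpha$th
particle has mass $m^{(\alpha)}$ and is positioned at $\mathbf{r}^{(\alpha)}$
with respect to the origin $O$. Suppose that the rigid assembly is rotating
about an axis through $O$ with angular velocity $\boldsymbol{\omega}$. The
angular momentum $\mathbf{J}$ of the assembly is given by

$$\mathbf{J} = \sum_{\alpha} \mathbf{r}^{(\alpha)} \times \mathbf{p}^{(\alpha)}.$$

Here $\mathbf{p}^{(\alpha)} = m^{(\alpha)} \dot{\mathbf{r}}^{(\alpha)}$ and
$\dot{\mathbf{r}}^{(\alpha)} = \boldsymbol{\omega} \times \mathbf{r}^{(\alpha)}$
for any $\alpha$, whose components are expressed in subscript form as

$$p_k^{(\alpha)} = m^{(\alpha)} \dot{x}_k^{(\alpha)} \quad \text{and} \quad \dot{x}_k^{(\alpha)} = \varepsilon_{klm} \omega_l x_m^{(\alpha)}.$$

Thus we obtain

$$\begin{aligned}
J_i &= \sum_{\alpha} \sum_{j,k} \varepsilon_{ijk} x_j^{(\alpha)} p_k^{(\alpha)} = \sum_{\alpha} \sum_{j,k,l,m} m^{(\alpha)} \varepsilon_{ijk} x_j^{(\alpha)} \varepsilon_{klm} \omega_l x_m^{(\alpha)} \\
&= \sum_{\alpha} \sum_{j,l,m} m^{(\alpha)} (\delta_{il} \delta_{jm} - \delta_{im} \delta_{jl}) x_j^{(\alpha)} x_m^{(\alpha)} \omega_l \\
&= \sum_{\alpha} \sum_{l} m^{(\alpha)} \left[ \left( r^{(\alpha)} \right)^2 \delta_{il} - x_i^{(\alpha)} x_l^{(\alpha)} \right] \omega_l \equiv \sum_{l} I_{il} \omega_l,
\end{aligned} \tag{18.50}$$

with the definition

$$I_{il} = \sum_{\alpha} m^{(\alpha)} \left[ \left( r^{(\alpha)} \right)^2 \delta_{il} - x_i^{(\alpha)} x_l^{(\alpha)} \right]. \tag{18.51}$$

The set of quantities $I_{il}$ forms a symmetric second-order Cartesian
tensor; the symmetric property expressed by $I_{il} = I_{li}$ follows readily
from (18.51). The fact that the $I_{il}$ form a tensor can be proved by
applying the quotient rule (see Sect. 18.4.6) to equation (18.50), wherein
$J_i$ and $\omega_l$ are vectors. The tensor $I_{il}$ is called the inertia
tensor of the assembly with respect to $O$. As evident from (18.51), $I_{il}$
depends only on the distribution of mass in the assembly and not on the
direction or magnitude of the angular velocity of the assembly,
$\boldsymbol{\omega}$.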

If a continuous rigid body is considered, $m^{(\alpha)}$ is replaced by the
mass distribution $\rho(\mathbf{r})$ and the summation $\sum_{\alpha}$ by the
integral $\int dV$ over the volume of the whole body. When expanded in
Cartesian coordinates, the inertia tensor of a continuous body would have the
form

$$\mathsf{I} = [I_{ij}] = \begin{pmatrix}
\int (y^2 + z^2) \rho \, dV & -\int xy \rho \, dV & -\int zx \rho \, dV \\
-\int xy \rho \, dV & \int (z^2 + x^2) \rho \, dV & -\int yz \rho \, dV \\
-\int zx \rho \, dV & -\int yz \rho \, dV & \int (x^2 + y^2) \rho \, dV
\end{pmatrix}.$$

The diagonal elements of this tensor are called the moments of inertia
and the off-diagonal elements without the negative signs are known as the
products of inertia.

It is possible to show that the kinetic energy $K$ of the rotating system
is given by $K = \frac{1}{2} \sum_{j,l} I_{jl} \omega_j \omega_l$. In fact, an
argument parallel to that leading to (18.50) yields

$$\begin{aligned}
K &= \frac{1}{2} \sum_{\alpha} m^{(\alpha)} \dot{\mathbf{r}}^{(\alpha)} \cdot \dot{\mathbf{r}}^{(\alpha)} \\
&= \frac{1}{2} \sum_{\alpha} m^{(\alpha)} \sum_{i,j,k,l,m} \varepsilon_{ijk} \omega_j x_k^{(\alpha)} \varepsilon_{ilm} \omega_l x_m^{(\alpha)} \\
&= \frac{1}{2} \sum_{j,l} I_{jl} \omega_j \omega_l.
\end{aligned}$$

This shows that the kinetic energy of the rotating body can be expressed as
a scalar obtained by twice contracting the vector $\omega_j$ with the inertia
tensor $I_{jl}$. Alternatively, since $J_j = \sum_l I_{jl} \omega_l$, the
kinetic energy may be written as $K = \frac{1}{2} \sum_j J_j \omega_j$.
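The relations $J_i = I_{il} \omega_l$ and $K = \frac{1}{2} I_{jl} \omega_j \omega_l$ can be cross-checked against direct particle sums. The sketch below uses a small, arbitrary set of point masses; the masses, positions, angular velocity, and function names are all illustrative assumptions.

```python
def cross(a, b):
    """Elementary vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def inertia_tensor(masses, positions):
    """I_il = sum_alpha m [ r^2 delta_il - x_i x_l ], cf. (18.51)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)
        for i in range(3):
            for l in range(3):
                I[i][l] += m * (r2 * (1.0 if i == l else 0.0) - x[i] * x[l])
    return I


masses = [1.0, 2.0, 3.0]
positions = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0], [2.0, 1.0, 0.0]]
omega = [0.5, -1.0, 2.0]

I = inertia_tensor(masses, positions)

# Tensor expressions: J_i = I_il omega_l and K = (1/2) I_jl omega_j omega_l.
J_tensor = [sum(I[i][l] * omega[l] for l in range(3)) for i in range(3)]
K_tensor = 0.5 * sum(I[j][l] * omega[j] * omega[l]
                     for j in range(3) for l in range(3))

# Direct sums over particles, using the velocity omega x r of each particle.
J_direct = [0.0, 0.0, 0.0]
K_direct = 0.0
for m, x in zip(masses, positions):
    velocity = cross(omega, x)
    contribution = cross(x, [m * c for c in velocity])  # r x p
    J_direct = [p + q for p, q in zip(J_direct, contribution)]
    K_direct += 0.5 * m * sum(c * c for c in velocity)
```

Note that `I` is computed once from the mass distribution and can then be reused for any angular velocity, which is the practical benefit of the tensor formulation.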

18.5.2 Tensors in Electromagnetism in Solids 

Magnetic susceptibility and electric conductivity are also examples of
physical quantities represented by second-order tensors. For the former, we
have the standard expression

$$M_i = \sum_{j} \chi_{ij} H_j, \tag{18.52}$$

where $\mathbf{M}$ is the magnetic moment per unit volume and $\mathbf{H}$ is
the magnetic field. Similarly, for the case of electric conductivity, we can
write

$$j_i = \sum_{j} \sigma_{ij} E_j. \tag{18.53}$$

Here, the current density $\mathbf{j}$ (current per unit perpendicular area)
is related to the electric field $\mathbf{E}$. In both cases, we have a vector
on the left-hand side and the contraction of a second-order tensor with
another vector on the right-hand side.

For isotropic media, the vector $\mathbf{M}$ is parallel to $\mathbf{H}$ and,
similarly, the vector $\mathbf{j}$ is parallel to $\mathbf{E}$. Thus, the
above tensors satisfy $\chi_{ij} = \chi \delta_{ij}$ and
$\sigma_{ij} = \sigma \delta_{ij}$, respectively, resulting in
$\mathbf{M} = \chi \mathbf{H}$ and $\mathbf{j} = \sigma \mathbf{E}$. However,
for anisotropic materials such as crystals, the magnetic susceptibility and
electric conductivity may be different along different crystal axes, thus
making $\chi_{ij}$ and $\sigma_{ij}$ general second-order tensors (usually
symmetric).


18.5.3 Electromagnetic Field Tensor 

All the tensors that we have considered in this chapter so far relate to the
three dimensions of space and they are defined as having a certain
transformation property under spatial rotations. In this subsection, we shall
have the occasion to use a tensor in the four dimensions of relativistic
space-time; the tensor is the electromagnetic field tensor $F_{\mu\nu}$.

Recall that an electromagnetic field in free space is governed by the
Maxwell equations, which take the form

$$\nabla \cdot \mathbf{B} = 0, \qquad \nabla \cdot \mathbf{E} = 4 \pi k_1 \rho,$$

$$\nabla \times \mathbf{B} = 4 \pi k_2 \mathbf{J} + \frac{k_2}{k_1} \frac{\partial \mathbf{E}}{\partial t}, \qquad \nabla \times \mathbf{E} = -k_3 \frac{\partial \mathbf{B}}{\partial t}.$$

Here $\mathbf{E}$ is the electric field intensity, $\mathbf{B}$ is the
magnetic induction, $\rho$ is the charge density, and $\mathbf{J}$ is the
current density. There are several ways of defining the values of the
constants $k_i$ ($i = 1, 2, 3$); indeed, their values depend on which
system of units we use. Typical examples are listed in Table 18.1.

The Maxwell equations take on a particularly simple and elegant form on
introducing the electromagnetic field tensor $F_{\mu\nu}$ defined as

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{18.54}$$

Here, $A_\mu = (\phi / c, -\mathbf{A})$ is called a four-potential, determined
by the scalar potential $\phi$ and the vector potential $\mathbf{A}$ that
generate the fields $\mathbf{B} = \nabla \times \mathbf{A}$ and
$\mathbf{E} = -\nabla \phi - \partial \mathbf{A} / \partial t$. The symbol
$\partial_\mu$ in (18.54) denotes the partial derivative with respect to the
$\mu$th coordinate. Straightforward calculations yield

$$[F_{\mu\nu}] = \begin{pmatrix}
0 & E^1/c & E^2/c & E^3/c \\
-E^1/c & 0 & -B^3 & B^2 \\
-E^2/c & B^3 & 0 & -B^1 \\
-E^3/c & -B^2 & B^1 & 0
\end{pmatrix}, \tag{18.55}$$

where $\mathbf{E} = (E^1, E^2, E^3)$ and $\mathbf{B} = (B^1, B^2, B^3)$. We
also introduce another relevant tensor defined by

$$[F^{\mu\nu}] = \begin{pmatrix}
0 & -E^1/c & -E^2/c & -E^3/c \\
E^1/c & 0 & -B^3 & B^2 \\
E^2/c & B^3 & 0 & -B^1 \\
E^3/c & -B^2 & B^1 & 0
\end{pmatrix}, \tag{18.56}$$

in which $\mu$ and $\nu$ are superscripts, as opposed to (18.55), where they
are subscripts. As a result, we can see that the Maxwell equations (written
here in MKSA units) are equivalent to the following two field equations:

$$\sum_{\nu} \partial_\nu F^{\nu\mu} = \mu_0 j^\mu,$$

$$\partial_\alpha F_{\mu\nu} + \partial_\mu F_{\nu\alpha} + \partial_\nu F_{\alpha\mu} = 0,$$

where $j^\mu = (\rho c, \mathbf{J})$ is the four-current density.

Table 18.1. Values of the constants $k_i$ ($i = 1, 2, 3$) in the Maxwell equations. $\mu_0$, $\varepsilon_0$,
and $c$ are the permeability, permittivity, and speed of light in vacuum, respectively

System of units    k_1                       k_2              k_3
MKSA               1/(4\pi\varepsilon_0)     \mu_0/(4\pi)     1
CGS-esu            1                         1/c^2            1
CGS-emu            c^2                       1                1
CGS-Gauss          1                         1/c              1/c

Remark. The distinction between subscripts and superscripts on the symbol
$F$ shown in (18.55) and (18.56), respectively, is clarified in Chap. 19, which
deals with non-Cartesian tensor calculus.
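The two matrices (18.55) and (18.56) differ only in the sign of the entries that mix time and space indices. The sketch below builds the subscripted form from given field components and obtains the superscripted form by flipping those signs. It assumes the diagonal metric with signature $(+, -, -, -)$ for this step, which the text does not introduce at this point (index positions are the subject of Chap. 19), and the numerical field values and the value of `c` are arbitrary units chosen for exact arithmetic.

```python
def field_tensor_lower(E, B, c):
    """Matrix [F_mu_nu] of (18.55) built from the field components."""
    e = [x / c for x in E]
    return [[0.0, e[0], e[1], e[2]],
            [-e[0], 0.0, -B[2], B[1]],
            [-e[1], B[2], 0.0, -B[0]],
            [-e[2], -B[1], B[0], 0.0]]


def raise_indices(F):
    """Multiply each entry by eta_mm * eta_nn with eta = diag(1, -1, -1, -1).

    Assumption: the Minkowski metric with signature (+, -, -, -).
    """
    eta = [1.0, -1.0, -1.0, -1.0]
    return [[eta[m] * eta[n] * F[m][n] for n in range(4)] for m in range(4)]


# Arbitrary sample values in arbitrary units (c is NOT the physical constant).
c = 3.0
E = [3.0, 6.0, 9.0]
B = [4.0, 5.0, 6.0]

F_lower = field_tensor_lower(E, B, c)
F_upper = raise_indices(F_lower)

is_antisymmetric = all(F_lower[m][n] == -F_lower[n][m]
                       for m in range(4) for n in range(4))
```

With these sample values the first row of `F_upper` is $(0, -1, -2, -3)$ while the spatial $3 \times 3$ block is unchanged, matching the pattern of signs in (18.56).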


18.5.4 Elastic Tensor 

Thus so far, we have focused on the physical applications of second-order 
tensors, which relate two vectors. Now, we extend this idea to a situation 
where a fourth-order tensor relates two physical second-order tensors. Such 
relationships commonly occur in elasticity theory. In the framework of this 
theory, the local deformation of an elastic body at any interior point P is 
described by a second-order symmetric tensor eij called the strain tensor, 
which is given by 

1 / du.i duj\ 

2 \ctry dxi J ’ 

where u is the displacement vector describing the strain of a small volume 
element. Similarly, we can describe the stress in the body at P by a second- 
order symmetric stress tensor pij ; the quantity pij is the a; -component of 
the stress vector acting across a plane through P, whose normal lies in the ay- 
direction. A generalization of Hooke’s law that relates the stress and strain 
tensors is 

Pij = ^2 C ijkl e kl i (18.57) 

k,l 

where Cijki is a fourth-order Cartesian tensor. 

Specifically, for an isotropic medium, we must have an isotropic tensor for
$c_{ijkl}$; the most general fourth-order isotropic tensor is

$$c_{ijkl} = \lambda\, \delta_{ij}\delta_{kl} + \eta\, \delta_{ik}\delta_{jl} + \nu\, \delta_{il}\delta_{jk}.$$

Substituting this into (18.57) yields

$$p_{ij} = \lambda\, \delta_{ij} \sum_{k} e_{kk} + \eta\, e_{ij} + \nu\, e_{ji}. \tag{18.58}$$

Note that $e_{ij}$ is symmetric. Hence, if we write $\eta + \nu = 2\mu$, (18.58) takes the
conventional form

$$p_{ij} = \lambda \left( \sum_{k} e_{kk} \right) \delta_{ij} + 2\mu\, e_{ij},$$

in which $\lambda$ and $\mu$ are known as Lamé constants.
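The reduction from (18.57) to the Lamé form can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and all numerical values are illustrative assumptions. It builds the isotropic tensor $c_{ijkl}$, contracts it with a symmetric sample strain, and compares the result with $\lambda (\sum_k e_{kk}) \delta_{ij} + 2\mu e_{ij}$.

```python
# Sketch: contraction of the isotropic fourth-order tensor with a symmetric
# strain, compared with the Lame form. All names and numbers are illustrative.

def delta(i, j):
    """Kronecker delta for Cartesian indices."""
    return 1.0 if i == j else 0.0

def isotropic_c(lam, eta, nu):
    """c_ijkl = lam*d_ij*d_kl + eta*d_ik*d_jl + nu*d_il*d_jk as nested lists."""
    r = range(3)
    return [[[[lam * delta(i, j) * delta(k, l)
               + eta * delta(i, k) * delta(j, l)
               + nu * delta(i, l) * delta(j, k)
               for l in r] for k in r] for j in r] for i in r]

def stress_from_contraction(c, e):
    """p_ij = sum over k, l of c_ijkl * e_kl, as in (18.57)."""
    r = range(3)
    return [[sum(c[i][j][k][l] * e[k][l] for k in r for l in r) for j in r] for i in r]

def stress_lame(lam, mu, e):
    """p_ij = lam * trace(e) * d_ij + 2 * mu * e_ij."""
    trace = sum(e[k][k] for k in range(3))
    return [[lam * trace * delta(i, j) + 2.0 * mu * e[i][j] for j in range(3)]
            for i in range(3)]

lam, eta, nu = 2.0, 0.7, 0.3          # arbitrary sample constants
mu = 0.5 * (eta + nu)                  # eta + nu = 2 * mu
strain = [[0.010, 0.002, 0.000],
          [0.002, -0.004, 0.001],
          [0.000, 0.001, 0.003]]       # symmetric sample strain

p_full = stress_from_contraction(isotropic_c(lam, eta, nu), strain)
p_lame = stress_lame(lam, mu, strain)
max_diff = max(abs(p_full[i][j] - p_lame[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

The two results agree only because the strain is symmetric; with a non-symmetric $e_{ij}$ the terms $\eta e_{ij}$ and $\nu e_{ji}$ could not be merged.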
Abstract Having discussed tensor theory based on Cartesian coordinates, we now
move on to its counterpart, i.e., tensors described by curvilinear coordinate systems. 
The use of a curvilinear coordinate system endows the tensor calculus with the 
properties of “covariance” (Sect. 19.1.3) and “contravariance” (Sect. 19.1.4), both 
of which are new concepts originating from the nonorthogonality of the coordinate 
axes. 


19.1 Curvilinear Coordinate Systems 

19.1.1 Local Basis Vectors 

We have thus far restricted our attention to the study of Cartesian tensors,
where, from a practical standpoint, only rigid rotations of axes (proper
and/or improper) are taken into account as coordinate transformations. However,
we must free ourselves from this restriction and develop the tensor calculus
in terms of curvilinear coordinate systems. In advanced mathematical
physics, we often have to deal with tensor analysis on curved surfaces (or more
abstract manifolds) on which orthonormal coordinate systems cannot be defined,
and in such cases the theory developed thus far is entirely inadequate.
This means that we have to formulate tensors and their transformations in
terms of general curvilinear coordinate systems.

To begin with, we review some properties of general curvilinear coordinates.
Suppose that the position of an arbitrary point $P$ in a three-dimensional
space has Cartesian coordinates $x, y, z$. In general, this position may be
expressed in terms of three curvilinear coordinates $u_1, u_2, u_3$, which are functions
of $x, y, z$ as explicitly represented by

$$u_1 = u_1(x,y,z), \qquad u_2 = u_2(x,y,z), \qquad u_3 = u_3(x,y,z).$$

We denote by $\mathbf{r}$ the position arrow connecting the origin $O$ and the point $P$.
Obviously, the direction and magnitude of the arrow depend on the coordinates
of $P$, which is symbolized by

$$\mathbf{r} = \mathbf{r}(u_1, u_2, u_3).$$


We now consider the partial derivative of $\mathbf{r}$ with respect to $u_i$, i.e.,

$$\mathbf{e}_i = \frac{\partial \mathbf{r}}{\partial u_i}. \tag{19.1}$$

From the definition, the vectors $\mathbf{e}_i$ are directed along the corresponding
coordinate lines at the point $P$. As a result, an infinitesimal vector displacement
$d\mathbf{r}$ in curvilinear coordinates is given by

$$d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial u_i}\, du_i = \mathbf{e}_i\, du_i,$$

where the summation convention is employed. The vectors $\mathbf{e}_i$ are referred to
as local basis vectors. (In precise terminology, they are called covariant
local basis vectors, as explained later.)

It is obvious from (19.1) that the vectors $\mathbf{e}_i$ are functions of the curvilinear
coordinates $u_i$, namely, $\mathbf{e}_i = \mathbf{e}_i(u_1, u_2, u_3)$. This implies that the directions
and magnitudes of the $\mathbf{e}_i$ vary from point to point in the space considered,
which is in contrast to the case of a Cartesian coordinate system, where the
basis vectors are spatially independent. Spatial dependence of basis vectors
is actually one of the most important properties of curvilinear coordinate
systems.

Another notable property of curvilinear coordinate systems is the fact that
they allow us to define another useful set of three vectors at $P$ as

$$\boldsymbol{\varepsilon}_i = \nabla u_i.$$

Clearly, the direction of $\boldsymbol{\varepsilon}_i$ is normal to the surface $u_i = \text{const}$ and thus
differs in general from the direction of any of the vectors $\mathbf{e}_i$ ($i = 1, 2, 3$) (see
Fig. 19.1). Therefore, at each point $P$ in a curvilinear coordinate system,
there exist two sets of basis vectors defined by



$$\mathbf{e}_i = \frac{\partial \mathbf{r}}{\partial u_i} \quad\text{and}\quad \boldsymbol{\varepsilon}_i = \nabla u_i. \tag{19.2}$$

Fig. 19.1. (a) Spatial dependence of $\mathbf{e}_i$ in the curvilinear coordinate system. (b)
Difference between $\mathbf{e}_i$ and $\boldsymbol{\varepsilon}_i$.

In the literature of tensor analysis, the set of vectors $\boldsymbol{\varepsilon}_i$ introduced above is
denoted by $\mathbf{e}^i$, the index being placed as a superscript to distinguish it from the
first set of vectors $\mathbf{e}_i$. In connection with this notation, we introduce a modified
summation convention as follows: if a lower-case alphabetic index
appears twice, once as a subscript and once as a superscript, we sum
over all the values that the index can take. In this convention, the curvilinear
coordinates are denoted by $u^1, u^2, u^3$, with the index raised (see the remark
in Sect. 19.1.3), to arrive at the following definition.

♠ Local basis vectors:

A curvilinear coordinate system is characterized by two sets of three
vectors $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^i\}$ defined by

$$\mathbf{e}_i = \frac{\partial \mathbf{r}}{\partial u^i} \quad\text{and}\quad \mathbf{e}^i = \nabla u^i.$$

Here, the $\mathbf{e}_i$ are referred to as the covariant local basis vectors, and the
$\mathbf{e}^i$ as the contravariant local basis vectors.

The prefix “local” emphasizes the fact that the lengths and orientations of 
these basis vectors vary from point to point in the space; this fact is explicitly 
represented by 

$$\mathbf{e}_i = \mathbf{e}_i(u^1, u^2, u^3) \quad\text{and}\quad \mathbf{e}^j = \mathbf{e}^j(u^1, u^2, u^3).$$

For the sake of conciseness, we omit the prefix in the subsequent discussions 
and use the terms contravariant (or covariant) basis vectors, bearing the 
locality in mind. 

Remark. 

1. In common practice, indices that represent contravariant character are
placed as superscripts and those indicating covariant character as subscripts.

2. For Cartesian coordinate systems, the two sets of basis vectors $\mathbf{e}_i$ and $\mathbf{e}^i$
are identical and, hence, there is no need to differentiate between
contravariance and covariance.

3. In derivatives such as $\partial \mathbf{r}/\partial u^i$, the index $i$ is considered as a subscript.


19.1.2 Reciprocity Relations 

Generally, the covariant basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ are neither of unit length
nor orthogonal to each other; this is also true for the contravariant
basis vectors $\mathbf{e}^1$, $\mathbf{e}^2$, and $\mathbf{e}^3$. Nevertheless, the sets $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^j\}$ still have
an important property as stated below.


♠ Reciprocity relations:

The sets of covariant and contravariant local basis vectors $\{\mathbf{e}_i\}$ and
$\{\mathbf{e}^j\}$ satisfy the reciprocity relations

$$\mathbf{e}_i \cdot \mathbf{e}^j = \delta_i^j, \tag{19.3}$$

where the scalar product of the vectors is taken in the sense of elementary
vector calculus.


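Proof By using the Cartesian representation, we have

$$\mathbf{e}_i \cdot \mathbf{e}^j = \frac{\partial \mathbf{r}}{\partial u^i} \cdot \nabla u^j
= \left( \frac{\partial x}{\partial u^i}, \frac{\partial y}{\partial u^i}, \frac{\partial z}{\partial u^i} \right) \cdot
\left( \frac{\partial u^j}{\partial x}, \frac{\partial u^j}{\partial y}, \frac{\partial u^j}{\partial z} \right)$$

$$= \frac{\partial x}{\partial u^i}\frac{\partial u^j}{\partial x} + \frac{\partial y}{\partial u^i}\frac{\partial u^j}{\partial y} + \frac{\partial z}{\partial u^i}\frac{\partial u^j}{\partial z}
= \frac{\partial u^j}{\partial u^i} = \delta_i^j. \quad ♣$$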


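Remark. The reciprocity relation (19.3) implies that each covariant (or
contravariant) basis vector $\mathbf{e}_i$ (or $\mathbf{e}^i$) is perpendicular to all contravariant (or
covariant) basis vectors $\mathbf{e}^k$ (or $\mathbf{e}_k$) except the one with $k = i$. For instance, $\mathbf{e}_1$ is
perpendicular to $\mathbf{e}^2$ and $\mathbf{e}^3$, but not to $\mathbf{e}^1$ in general. To be precise, the vectors $\mathbf{e}_1$
and $\mathbf{e}^1$ make an angle $\theta$ that satisfies

$$\mathbf{e}_1 \cdot \mathbf{e}^1 = |\mathbf{e}_1|\, |\mathbf{e}^1| \cos\theta = 1,$$

where $|\mathbf{e}_1| \neq 1$ and $|\mathbf{e}^1| \neq 1$ in general.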
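A small numerical illustration of the reciprocity relations may help here. The sketch below is an assumption-laden example rather than anything taken from the text: it uses an oblique (skew) coordinate system $x = u^1 + u^2\cos\alpha$, $y = u^2\sin\alpha$, $z = u^3$ with a sample angle $\alpha$, writes out $\mathbf{e}_i = \partial\mathbf{r}/\partial u^i$ and $\mathbf{e}^i = \nabla u^i$ by hand, and evaluates all nine scalar products.

```python
import math

alpha = math.radians(60.0)   # angle between the u1 and u2 axes (sample value)

# Covariant basis e_i = dr/du^i for x = u1 + u2*cos(alpha), y = u2*sin(alpha), z = u3
e_cov = [(1.0, 0.0, 0.0),
         (math.cos(alpha), math.sin(alpha), 0.0),
         (0.0, 0.0, 1.0)]

# Contravariant basis e^i = grad u^i, using the inverse relations
# u1 = x - y*cot(alpha), u2 = y/sin(alpha), u3 = z
e_con = [(1.0, -1.0 / math.tan(alpha), 0.0),
         (0.0, 1.0 / math.sin(alpha), 0.0),
         (0.0, 0.0, 1.0)]

def dot(a, b):
    """Elementary scalar product of two Cartesian triples."""
    return sum(x * y for x, y in zip(a, b))

# reciprocity[i][j] holds e_i . e^j and should reproduce the Kronecker delta
reciprocity = [[dot(e_cov[i], e_con[j]) for j in range(3)] for i in range(3)]

# e_1 and e^1 are not parallel: their angle follows from |e_1||e^1|cos(theta) = 1
length_e1_cov = math.sqrt(dot(e_cov[0], e_cov[0]))
length_e1_con = math.sqrt(dot(e_con[0], e_con[0]))
cos_theta = 1.0 / (length_e1_cov * length_e1_con)

for row in reciprocity:
    print([round(value, 12) for value in row])
print(length_e1_con, cos_theta)
```

In this example $|\mathbf{e}_1| = 1$ but $|\mathbf{e}^1| = 1/\sin\alpha$, so the angle between $\mathbf{e}_1$ and $\mathbf{e}^1$ satisfies $\cos\theta = \sin\alpha$.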


19.1.3 Transformation Law of Covariant Basis Vectors 

We are now in a position to discuss the concept of general transformations
from one coordinate system, $u^1, u^2, u^3$, to another, $u'^1, u'^2, u'^3$. A coordinate
transformation is described by using the three equations

$$u'^i = u'^i(u^1, u^2, u^3) \tag{19.4}$$

for $i = 1, 2, 3$, in which the new coordinates $u'^i$ can be arbitrary functions of
the old ones $u^i$. We assume that the transformation can be inverted, so that
we can write the old coordinates in terms of the new ones as

$$u^i = u^i(u'^1, u'^2, u'^3).$$


We now formulate the transformation law of basis vectors. The two sets
of basis vectors in the new coordinate system are given by

$$\mathbf{e}'_i = \frac{\partial \mathbf{r}}{\partial u'^i} \quad\text{and}\quad \mathbf{e}'^i = \nabla u'^i. \tag{19.5}$$

Using the chain rule, we find that the first set of basis vectors yields

$$\mathbf{e}'_i = \frac{\partial \mathbf{r}}{\partial u'^i} = \frac{\partial \mathbf{r}}{\partial u^j}\frac{\partial u^j}{\partial u'^i} = \frac{\partial u^j}{\partial u'^i}\, \mathbf{e}_j. \tag{19.6}$$

This describes the transformation behavior of the local covariant basis vectors
from the unprimed ones $\mathbf{e}_j$ to the primed ones $\mathbf{e}'_i$ under the coordinate
transformation (19.4). Note that the partial derivatives as well as the basis vectors
in (19.6) vary from point to point. Hence, relation (19.6) is valid under the
condition that all terms involved are evaluated at the same point $P$ in the
space being considered.

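In the same manner, it follows that

$$\mathbf{e}_k = \frac{\partial \mathbf{r}}{\partial u^k} = \frac{\partial \mathbf{r}}{\partial u'^l}\frac{\partial u'^l}{\partial u^k} = \frac{\partial u'^l}{\partial u^k}\, \mathbf{e}'_l.$$

We have thus proved the following theorem:

♠ Transformation law of covariant basis vectors:

The sets of local covariant basis vectors $\{\mathbf{e}_i(u^j)\}$ and $\{\mathbf{e}'_k(u'^l)\}$
associated with two different curvilinear coordinate systems are related at a
point $P$ by

$$\mathbf{e}'_k = \frac{\partial u^i}{\partial u'^k}\, \mathbf{e}_i \quad\text{and}\quad \mathbf{e}_k = \frac{\partial u'^l}{\partial u^k}\, \mathbf{e}'_l, \tag{19.7}$$

in which the partial derivatives are to be evaluated at $P$.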


Remark. Observe that in all the mathematical expressions above (and below),
the summation convention is applied to the indices that are repeated
in one term as both a subscript and a superscript. Indeed, it was to satisfy
this summation convention that the coordinates were written as $u^i$ rather
than $u_i$.






19 Non-Cartesian Tensors 


19.1.4 Transformation Law of Contravariant Basis Vectors 


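Next we consider the transformation law of the contravariant basis vectors
$\mathbf{e}^i = \nabla u^i$. Recall that in terms of a rectangular Cartesian coordinate system,
the operator $\nabla$ is expressed as

$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z},$$

where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are mutually orthogonal basis vectors of unit length. It then
follows that

$$\mathbf{e}'^k = \nabla u'^k = \mathbf{i}\frac{\partial u'^k}{\partial x} + \mathbf{j}\frac{\partial u'^k}{\partial y} + \mathbf{k}\frac{\partial u'^k}{\partial z},$$

in which the first partial derivative reads

$$\frac{\partial u'^k}{\partial x} = \frac{\partial u^i}{\partial x}\frac{\partial u'^k}{\partial u^i},$$

and the other derivatives are written in the same way. Hence, we have

$$\mathbf{e}'^k = \left( \mathbf{i}\frac{\partial u^i}{\partial x} + \mathbf{j}\frac{\partial u^i}{\partial y} + \mathbf{k}\frac{\partial u^i}{\partial z} \right) \frac{\partial u'^k}{\partial u^i}
= \left( \nabla u^i \right) \frac{\partial u'^k}{\partial u^i} = \frac{\partial u'^k}{\partial u^i}\, \mathbf{e}^i.$$

Similarly, we have

$$\mathbf{e}^j = \nabla u^j = \left( \mathbf{i}\frac{\partial u'^l}{\partial x} + \mathbf{j}\frac{\partial u'^l}{\partial y} + \mathbf{k}\frac{\partial u'^l}{\partial z} \right) \frac{\partial u^j}{\partial u'^l}
= \frac{\partial u^j}{\partial u'^l}\, \mathbf{e}'^l.$$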

These results are summarized as follows: 


♠ Transformation law of contravariant basis vectors:

The two sets of local contravariant basis vectors $\{\mathbf{e}^i(u^j)\}$ and $\{\mathbf{e}'^k(u'^l)\}$
are related at a point $P$ by

$$\mathbf{e}'^k = \frac{\partial u'^k}{\partial u^i}\, \mathbf{e}^i \quad\text{and}\quad \mathbf{e}^i = \frac{\partial u^i}{\partial u'^l}\, \mathbf{e}'^l, \tag{19.8}$$

where the partial derivatives are again to be evaluated at $P$.

It should be emphasized again that, owing to the summation convention, the
repeated indices in (19.8) appear once as a superscript and once as a subscript.

19.1.5 Components of a Vector 

Given the two bases $\mathbf{e}_i$ and $\mathbf{e}^i$, we may express a general geometric arrow $\mathbf{a}$
(i.e., a vector $\mathbf{a}$) equally well in terms of either basis as follows:

$$\mathbf{a} = a^1 \mathbf{e}_1 + a^2 \mathbf{e}_2 + a^3 \mathbf{e}_3 = a^i \mathbf{e}_i,$$
$$\mathbf{a} = a_1 \mathbf{e}^1 + a_2 \mathbf{e}^2 + a_3 \mathbf{e}^3 = a_i \mathbf{e}^i.$$

The $a^i$ are called the contravariant components of the vector $\mathbf{a}$ and the $a_i$
the covariant components. Both kinds of components $a^i$ and $a_i$ describe
the same vector $\mathbf{a}$, but they are associated with different basis vectors $\mathbf{e}_i$ and
$\mathbf{e}^i$, respectively. In plain words, a vector assigned at a point in a curvilinear
coordinate system has two different expressions, say, $(a^1, a^2, a^3)$ and $(a_1, a_2, a_3)$,
for the same vector $\mathbf{a}$. The tensorial characters of the two kinds of
components are inherently different from each other, as we shall see in subsequent
discussions.

For any vector $\mathbf{a}$, the two kinds of components $a^i$ and $a_i$ are readily
obtained by forming the scalar products

$$\mathbf{a} \cdot \mathbf{e}^i = a^j \mathbf{e}_j \cdot \mathbf{e}^i = a^j \delta_j^i = a^i$$

and

$$\mathbf{a} \cdot \mathbf{e}_i = a_j \mathbf{e}^j \cdot \mathbf{e}_i = a_j \delta^j_i = a_i,$$

where we have used the reciprocity relation (19.3). Furthermore, the
transformation law of $\mathbf{e}_j$ given in (19.7) gives us

$$\mathbf{a} = a'^i \mathbf{e}'_i = a^j \mathbf{e}_j = a^j \frac{\partial u'^i}{\partial u^j}\, \mathbf{e}'_i. \tag{19.9}$$

This provides the transformation law of the contravariant components of a
vector such that

$$a'^i = \frac{\partial u'^i}{\partial u^j}\, a^j. \tag{19.10}$$

This relation is, in fact, the defining property for a set of quantities $a^i$ to
form the contravariant components of a vector. The formal statement is given
below.


♠ Contravariant components of a vector:

Quantities $a^i$ associated with a point $P$ are said to be the contravariant
components of a vector if these quantities transform through the equation

$$a'^i = \frac{\partial u'^i}{\partial u^j}\, a^j, \tag{19.11}$$

where the partial derivatives are evaluated at $P$.
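The transformation law (19.11) can be tried out on a concrete case. The sketch below is illustrative only: the function names, the sample point, and the sample components are assumptions made here. It takes a vector given by its Cartesian components at a point of the plane, applies (19.11) with $(u'^1, u'^2) = (\rho, \phi)$, and then rebuilds the same arrow from the polar components and the polar covariant basis vectors.

```python
import math

def to_polar_components(x, y, ax, ay):
    """Apply a'^i = (du'^i/du^j) a^j with u = (x, y) and u' = (rho, phi)."""
    rho2 = x * x + y * y
    rho = math.sqrt(rho2)
    # d(rho)/dx = x/rho, d(rho)/dy = y/rho, d(phi)/dx = -y/rho^2, d(phi)/dy = x/rho^2
    a_rho = (x / rho) * ax + (y / rho) * ay
    a_phi = (-y / rho2) * ax + (x / rho2) * ay
    return a_rho, a_phi

def rebuild_cartesian(x, y, a_rho, a_phi):
    """Form a = a'^rho e'_rho + a'^phi e'_phi using the polar covariant basis."""
    rho = math.hypot(x, y)
    phi = math.atan2(y, x)
    e_rho = (math.cos(phi), math.sin(phi))
    e_phi = (-rho * math.sin(phi), rho * math.cos(phi))
    return (a_rho * e_rho[0] + a_phi * e_phi[0],
            a_rho * e_rho[1] + a_phi * e_phi[1])

x, y = 3.0, 4.0          # sample point P
ax, ay = 1.0, 2.0        # sample Cartesian components of the vector at P

a_rho, a_phi = to_polar_components(x, y, ax, ay)
back = rebuild_cartesian(x, y, a_rho, a_phi)
print(a_rho, a_phi, back)
```

The components change from $(1, 2)$ to a different pair of numbers, yet the rebuilt arrow coincides with the original one, which is the sense in which both sets describe the same vector.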


Remark. It might occur that a given ordered set of quantities $a^k$ associated
with a point $P$ has nothing to do with a vector; only those sets satisfying the
transformation law (19.11) serve as (contravariant) components of a vector.

Analogously to the case of (19.9), it follows from the identity for an arbitrary
vector $\mathbf{a}$,

$$\mathbf{a} = a'_i \mathbf{e}'^i = a_j \mathbf{e}^j = a_j \frac{\partial u^j}{\partial u'^i}\, \mathbf{e}'^i,$$

that the transformation law of covariant components reads

$$a'_i = \frac{\partial u^j}{\partial u'^i}\, a_j. \tag{19.12}$$

Again we take this result as the defining property of the covariant components
of a vector.


♠ Covariant components of a vector:

Quantities $a_i$ associated with a point $P$ are said to be the covariant
components of a vector if those quantities transform through the equation

$$a'_k = \frac{\partial u^i}{\partial u'^k}\, a_i, \tag{19.13}$$

where the partial derivatives are evaluated at $P$.


Remark. Other textbooks may use the expression “contravariant (or covariant)
vector,” which is a distinctly different concept from a vector $\mathbf{a}$
or its components $a^i$ (or $a_i$) that we have just defined. Rather, a
contravariant vector is a collection of ordered triples,

$$\left\{ (a^1, a^2, a^3),\ (a'^1, a'^2, a'^3),\ (a''^1, a''^2, a''^3),\ \ldots \right\},$$

in which all the ordered triples consist of contravariant components of the
same vector $\mathbf{a}$ associated with different coordinate systems. We should bear
in mind that a contravariant (or covariant) vector is not expressed by a geometric
arrow as is done for a vector.


19.1.6 Components of a Tensor 

We now define geometric objects of the contravariant class, which are more 
complicated in character than vectors and begin with the following: 

♠ Contravariant components of a tensor:

Index quantities $T^{jk}$ associated with a point $P$ are said to be contravariant
components of a tensor if these quantities transform according to the
equation

$$T'^{jk} = \frac{\partial u'^j}{\partial u^l}\frac{\partial u'^k}{\partial u^m}\, T^{lm}. \tag{19.14}$$


There is no difficulty in defining covariant tensors of higher orders. For a 
tensor of second order, e.g., we have the definition below. 


♠ Covariant components of a tensor:

Index quantities $T_{jk}$ are said to be covariant components of a second-order
tensor if these quantities transform according to the equation

$$T'_{jk} = T_{lm}\, \frac{\partial u^l}{\partial u'^j}\frac{\partial u^m}{\partial u'^k}. \tag{19.15}$$

We shall see later that there are many examples of tensors of this kind in
physics and engineering. The moment of inertia, the stress of elasticity, and the
electromagnetic field are cases in point; if their components in terms of certain
coordinate systems are evaluated, they all turn out to obey the transformation
law (19.14).

In terminology, all quantities satisfying (19.11), (19.13) and (19.14), (19.15)
are called components of a first-order tensor and components of a
second-order tensor, respectively; the order equals the number of indices
attached. The definitions of tensors of higher orders are given through a
straightforward generalization of the above. Conversely, we can define a tensor
of zero order, called a scalar, that involves no index so that its single
component (i.e., the scalar itself) is unchanged under any coordinate
transformation; namely,

T' = T. 

Such a quantity is called an invariant. 

Remark. For any components of tensors, the number of indices is independent 
of the number of dimensions of the space considered. The definitions above 
for vectors, tensors, and scalars are all valid for an arbitrary n-dimensional 
space. 


19.1.7 Mixed Components of a Tensor 

Having defined contravariant and covariant components of a tensor, we can
now define another class of components, called mixed components of a
tensor, that involve both characters simultaneously.

♠ Mixed components of a tensor:

Index quantities $T^i{}_{jk}$ are said to be the mixed components of a tensor
of the third order if these quantities transform according to the equation

$$T'^i{}_{jk} = T^l{}_{mn}\, \frac{\partial u'^i}{\partial u^l}\frac{\partial u^m}{\partial u'^j}\frac{\partial u^n}{\partial u'^k}.$$

Clearly, $T^i{}_{jk}$ transforms contravariantly with respect to the first index $i$ but
covariantly with respect to the other indices $j$ and $k$.

If we consider the components of higher-order tensors in non-Cartesian
coordinates, there are even more possibilities. As an example, let us consider
a second-order tensor $\mathbf{T}$. Using the outer product notation, we may write $\mathbf{T}$
in three different ways:

$$\mathbf{T} = T^{ij}\, \mathbf{e}_i \otimes \mathbf{e}_j = T_i{}^j\, \mathbf{e}^i \otimes \mathbf{e}_j = T_{ij}\, \mathbf{e}^i \otimes \mathbf{e}^j,$$

where $T^{ij}$, $T_i{}^j$, and $T_{ij}$ are called the contravariant, mixed, and covariant
components of $\mathbf{T}$, respectively. It is important to remember that these
three sets of quantities form the components of the same tensor $\mathbf{T}$ but refer
to different tensor bases made up from the basis vectors of the coordinate
system. Again, if we use Cartesian coordinates, the three sets of components
are identical.

We may generalize the above equation to higher-order components. An
object $T^{\alpha\beta\cdots}{}_{\gamma\delta\cdots}$ is called a component of type $(n, m)$, in which the integers $n$
and $m$ represent the numbers of superscripts and subscripts, respectively.
By definition, components carrying only superscripts (i.e., $m = 0$) or those
carrying only subscripts (i.e., $n = 0$) are referred to as the contravariant and
covariant components, respectively; all others are called mixed components.


Remark. The order of indices needs caution. For instance, we shall see later
that in general

$$T^i{}_j \neq T_j{}^i.$$

Nevertheless, we can write $T^i_j$ with no clarification of the order of $i$ and $j$ if
no ambiguity occurs or the order of indices is irrelevant.


19.1.8 Kronecker Delta 


The Kronecker delta is a special kind of second-order tensor that has
mixed components given by $\delta^i_j$, and is defined as follows:

$$\delta^i_j = \begin{cases} 1 & (i = j), \\ 0 & (i \neq j). \end{cases}$$

As these are mixed components of a tensor, they transform as

$$\delta'^i_j = \frac{\partial u'^i}{\partial u^l}\frac{\partial u^m}{\partial u'^j}\, \delta^l_m
= \frac{\partial u'^i}{\partial u^l}\frac{\partial u^l}{\partial u'^j}
= \frac{\partial u'^i}{\partial u'^j}
= \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j, \end{cases}$$

since in the last partial derivative, $u'^i$ and $u'^j$ are independent coordinates.
Thus, we obtain the result

$$\delta'^i_j = \delta^i_j, \tag{19.16}$$

which means that the tensor consisting of $\delta^i_j$ has the same components in
all coordinate systems. This is why the tensor consisting of $\delta^i_j$ is called the
fundamental mixed tensor.

Remark. The components $\delta^{ij}$ (or $\delta_{ij}$) are of no special importance, since they
do not satisfy the invariance condition (19.16), which means that their values
change when we use other coordinate systems. An exception is the case of
rectangular coordinate systems, where the contravariant and covariant tensors
become identical, so that we have $\delta^{ij} = \delta^i_j = \delta_{ij}$.


19.2 Metric Tensor 

19.2.1 Definition 

We now introduce important quantities that describe the geometric character
of the space arithmetized by a certain curvilinear coordinate system. We know
that the scalar products of a vector $\mathbf{a}$ with the local basis vectors $\mathbf{e}_j$ and $\mathbf{e}^j$ yield

$$a_j = \mathbf{a} \cdot \mathbf{e}_j = a^i\, (\mathbf{e}_i \cdot \mathbf{e}_j) \quad\text{and}\quad a^j = \mathbf{a} \cdot \mathbf{e}^j = a_i\, (\mathbf{e}^i \cdot \mathbf{e}^j). \tag{19.17}$$

Now we introduce the following notation:

$$\mathbf{e}_i \cdot \mathbf{e}_j = g_{ij} = g_{ji}$$

and

$$\mathbf{e}^i \cdot \mathbf{e}^j = g^{ij} = g^{ji}.$$

We can then write (19.17) in the form

$$a_j = g_{jk}\, a^k \quad\text{and}\quad a^j = g^{jk}\, a_k.$$

These equations express the covariant components of the vector $\mathbf{a}$ in terms
of its contravariant components, and vice versa. We shall see that the nine
quantities $g_{ik}$ form a second-order tensor called a metric tensor.

♠ Metric tensor:

Two-index quantities defined by

$$g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j \quad\text{and}\quad g^{kl} = \mathbf{e}^k \cdot \mathbf{e}^l \tag{19.18}$$

serve as covariant and contravariant components of a second-order tensor
called a metric tensor.


The proof of the tensor character for the above is given in Exercise 1.


Remark. 

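1. Since both $\mathbf{e}_i$ and $\mathbf{e}^i$ are functions of the coordinates, so are the quantities
$g_{ij}$ and $g^{ij}$.

2. The mixed components $g^i{}_j$ of the metric tensor are identical to those of
$\delta^i_j$ since, by definition, we have

$$g^i{}_j = \mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j.$$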

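Examples We calculate the elements $g_{ij}$ for cylindrical coordinates, where
$(u^1, u^2, u^3) = (\rho, \phi, z)$ and $\rho$ and $\phi$ are related to Cartesian coordinates $x$ and
$y$ as $x = \rho\cos\phi$ and $y = \rho\sin\phi$. Hence, the position vector $\mathbf{r}$ of any point
may be written as

$$\mathbf{r} = \rho\cos\phi\, \mathbf{i} + \rho\sin\phi\, \mathbf{j} + z\, \mathbf{k},$$

where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are mutually orthogonal basis vectors of unit length. By definition, we have

$$\mathbf{e}_\rho = \frac{\partial \mathbf{r}}{\partial \rho} = \cos\phi\, \mathbf{i} + \sin\phi\, \mathbf{j}, \tag{19.19}$$

$$\mathbf{e}_\phi = \frac{\partial \mathbf{r}}{\partial \phi} = -\rho\sin\phi\, \mathbf{i} + \rho\cos\phi\, \mathbf{j}, \tag{19.20}$$

$$\mathbf{e}_z = \frac{\partial \mathbf{r}}{\partial z} = \mathbf{k}. \tag{19.21}$$

Thus the components of the metric tensor $[g_{ij}] = [\mathbf{e}_i \cdot \mathbf{e}_j]$ are found to be

$$[g_{ij}] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$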

19.2.2 Geometric Role of Metric Tensors 

The quantities $g_{ik}$ (or $g^{ik}$) describe the fundamental geometric character of
a space arithmetized by a certain $u^i$-coordinate system with a basis $\{\mathbf{e}_i\}$.
A geometric role for $g_{ij}$ was implied in the definition (19.18), where $g_{ij}$ equals
the scalar product of the two covariant local basis vectors $\mathbf{e}_i$ and $\mathbf{e}_j$. Hence,
$g_{ij}$ determines the lengths of, and the angles between, the local basis vectors $\mathbf{e}_i$ and $\mathbf{e}_j$ at each point and
thus describes the coordinate ($u^k$)-dependence of the vectors $\mathbf{e}_i = \mathbf{e}_i(u^k)$ and
$\mathbf{e}_j = \mathbf{e}_j(u^k)$ that span the space being considered. This implies the possibility
that the metric tensor $\mathbf{g}$ rather than the basis vectors can be regarded as a
more fundamental object determining the geometric nature of the space in
question. Indeed, we can establish the framework of tensor calculus based on
a knowledge of the spatial dependence of the metric tensor $\mathbf{g}$ without any
information about the local basis vectors. This point is dealt with in
Sect. 20.3.5.

The role of $g_{ij}$ in determining the geometric nature of the space also follows
from another standpoint, as shown below. Let $ds$ be the arc length between
two infinitely close points. We denote by $d\mathbf{r}$ the vector joining the two points,
whose covariant components are $du_i$ and contravariant components $du^i$. Then,
since $d\mathbf{r} = \mathbf{e}_i\, du^i = \mathbf{e}^k\, du_k$, we have

$$(ds)^2 = |d\mathbf{r}|^2 = d\mathbf{r} \cdot d\mathbf{r}
= \mathbf{e}_i\, du^i \cdot \mathbf{e}_k\, du^k = \mathbf{e}^i\, du_i \cdot \mathbf{e}^k\, du_k = \mathbf{e}^i\, du_i \cdot \mathbf{e}_k\, du^k,$$

or

$$(ds)^2 = g_{ik}\, du^i du^k, \tag{19.22}$$

$$(ds)^2 = g^{ik}\, du_i du_k, \tag{19.23}$$

$$(ds)^2 = du_i\, du^i. \tag{19.24}$$

Since $(ds)^2$ is a scalar, all of the quantities on the right-hand sides are also
scalars. It should also be noted that in (19.22) and (19.23), the $du^i$ (or $du_k$)
are contravariant (or covariant) components of a vector. Hence, in view of the
quotient theorem regarding two-index quantities (see Sect. 18.4.7), it turns out
that the symmetric quantities $g_{ik}$ (or $g^{ik}$) form covariant (or contravariant)
components of a second-order tensor.
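Equation (19.22) gives a direct recipe for measuring the length of a curve once the metric is known. The sketch below is an illustration under stated assumptions: it uses the cylindrical metric $g_{ij} = \mathrm{diag}(1, \rho^2, 1)$, a sample helix $\rho = 2$, $\phi = t$, $z = t$, and a simple midpoint rule; none of these choices come from the text.

```python
import math

def metric_cylindrical(u):
    """Covariant metric g_ik at the point u = (rho, phi, z)."""
    rho = u[0]
    return [[1.0, 0.0, 0.0],
            [0.0, rho * rho, 0.0],
            [0.0, 0.0, 1.0]]

def curve(t):
    """Sample curve in cylindrical coordinates: a helix with rho = 2, phi = t, z = t."""
    return (2.0, t, t)

def curve_velocity(t):
    """Derivatives du^i/dt of the sample curve (constant for this helix)."""
    return (0.0, 1.0, 1.0)

def arc_length(t0, t1, steps=1000):
    """Midpoint-rule integral of ds = sqrt(g_ik (du^i/dt) (du^k/dt)) dt."""
    dt = (t1 - t0) / steps
    total = 0.0
    for n in range(steps):
        t = t0 + (n + 0.5) * dt
        g = metric_cylindrical(curve(t))
        v = curve_velocity(t)
        ds2 = sum(g[i][k] * v[i] * v[k] for i in range(3) for k in range(3))
        total += math.sqrt(ds2) * dt
    return total

length = arc_length(0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * math.sqrt(5.0))
```

For this helix the integrand is the constant $\sqrt{\rho^2 + 1} = \sqrt{5}$, so one full turn has length $2\pi\sqrt{5}$; no basis vectors are needed, only $g_{ik}$.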


19.2.3 Riemann Space and Metric Tensor 

We have seen that in terms of tensor calculus, the metric tensor $\mathbf{g}$ rather than
the local basis vectors $\mathbf{e}_i$ and $\mathbf{e}^i$ is a more fundamental object in determining
geometric properties of the space being considered. In fact, an abstract space
of points to which we assign a certain class of second-order tensor $\mathbf{g}$ at each
point is referred to by a special name as stated below, which gives a formal
definition of the metric tensor $\mathbf{g}$ in the language of tensor calculus.

♠ Riemann space:

A finite-dimensional space of points labeled by an ordered set of real
coordinates $u^1, u^2, \cdots, u^n$ is called a Riemann space if it is possible to
define two-index quantities $g_{ij}$ that possess the following properties:

1. Each entity $g_{ij}(u^1, u^2, \cdots, u^n)$ is a real single-valued function of the
coordinates and has continuous partial derivatives.

2. $g_{ik}(u^1, u^2, \cdots, u^n) = g_{ki}(u^1, u^2, \cdots, u^n)$.

3. $g = \det[g_{ik}] \neq 0$.

The tensor $\mathbf{g}$ formed by the two-index quantities $g_{ij}$ noted above is called
a metric tensor of the space.


Remark. Note that the above definition of a metric tensor is free of the concept
of local basis vectors.

In this context, the superscripted components $g^{ij}$ are defined by

$$g^{ik} g_{kj} = \delta^i_j \quad\text{or}\quad g^{ik} = \frac{C^{ik}}{g},$$

where $C^{ik}$ $(= C^{ki})$ is the cofactor of $g_{ik}$ in the determinant $g = \det[g_{ik}]$. (See
Exercise 2 for the proof of the above.)
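The cofactor formula $g^{ik} = C^{ik}/g$ is easy to try out with exact arithmetic. The sketch below uses a sample symmetric $3 \times 3$ metric (the one belonging to an oblique system whose first two axes meet at $60^\circ$, an assumption made here for illustration) and verifies $g^{ik} g_{kj} = \delta^i_j$; the helper names are hypothetical.

```python
from fractions import Fraction

def minor_det(m, i, k):
    """Determinant of the 2x2 matrix left after deleting row i and column k."""
    rows = [r for r in range(3) if r != i]
    cols = [c for c in range(3) if c != k]
    return (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
            - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])

def cofactor(m, i, k):
    """Cofactor C^{ik} of the element m[i][k]."""
    return (-1) ** (i + k) * minor_det(m, i, k)

def inverse_by_cofactors(m):
    """Return (g^{ik}, g) with g^{ik} = C^{ik}/g for a symmetric 3x3 matrix m."""
    det = sum(m[0][k] * cofactor(m, 0, k) for k in range(3))
    # For a symmetric matrix C^{ik} = C^{ki}, so no transpose is needed here.
    upper = [[cofactor(m, i, k) / det for k in range(3)] for i in range(3)]
    return upper, det

g_lower = [[Fraction(1), Fraction(1, 2), Fraction(0)],
           [Fraction(1, 2), Fraction(1), Fraction(0)],
           [Fraction(0), Fraction(0), Fraction(1)]]

g_upper, g_det = inverse_by_cofactors(g_lower)
product = [[sum(g_upper[i][k] * g_lower[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print(g_det, g_upper, product)
```

Because the sample metric is symmetric, the cofactor matrix is symmetric as well, which is why the code can skip the transpose that the general adjugate formula requires.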

Our familiar Euclidean space is a particular class of Riemann space as 
stated below. 


♠ Flat Riemann space:

A Riemann space is flat if and only if it admits a system of rectangular
Cartesian coordinates $x^1, x^2, \cdots, x^n$ such that at every point of the space,

$$(ds)^2 = \varepsilon_1 (dx^1)^2 + \varepsilon_2 (dx^2)^2 + \cdots + \varepsilon_n (dx^n)^2, \tag{19.25}$$

where each $\varepsilon_i$ equals either $+1$ or $-1$.

♠ Euclidean space:

A Euclidean space is a flat Riemann space for which all $\varepsilon_i$ in (19.25)
are equal to $+1$.


19.2.4 Elements of Arc, Area, and Volume 

Below we describe several useful relations in connection with the elements of
arc length, areas, and volumes in terms of metric tensors.

1. Element of arc length

The element of arc length $ds_i$ along a particular coordinate curve $u^i$ with
fixed $i$ is

$$ds_i = |d\mathbf{r}| = |\mathbf{e}_i|\, du^i = \sqrt{\mathbf{e}_i \cdot \mathbf{e}_i}\, du^i = \sqrt{g_{ii}}\, du^i \quad (\text{no summation over } i).$$

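2. Element of area

The element of area $d\sigma_1$ in the coordinate surface $u^1 = \text{const}$, for instance,
reads

$$d\sigma_1 = |d\mathbf{r}_2 \times d\mathbf{r}_3| = |\mathbf{e}_2 \times \mathbf{e}_3|\, du^2 du^3
= \sqrt{(\mathbf{e}_2 \times \mathbf{e}_3) \cdot (\mathbf{e}_2 \times \mathbf{e}_3)}\, du^2 du^3$$

$$= \sqrt{(\mathbf{e}_2 \cdot \mathbf{e}_2)(\mathbf{e}_3 \cdot \mathbf{e}_3) - (\mathbf{e}_2 \cdot \mathbf{e}_3)(\mathbf{e}_2 \cdot \mathbf{e}_3)}\, du^2 du^3
= \sqrt{g_{22}\, g_{33} - (g_{23})^2}\, du^2 du^3.$$

Similarly, we have

$$d\sigma_2 = \sqrt{g_{33}\, g_{11} - (g_{31})^2}\, du^3 du^1,$$
$$d\sigma_3 = \sqrt{g_{11}\, g_{22} - (g_{12})^2}\, du^1 du^2,$$

which are summarized by

$$d\sigma_i = \sqrt{g_{jj}\, g_{kk} - (g_{jk})^2}\, du^j du^k \quad (\text{no summation over } j \text{ and } k),$$

where $i, j, k$ is a cyclic permutation of the numbers 1, 2, 3.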

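3. Element of volume

Finally, we can derive the equation for the element of volume as

$$dV = |(d\mathbf{r}_1 \times d\mathbf{r}_2) \cdot d\mathbf{r}_3| = |(\mathbf{e}_1 \times \mathbf{e}_2) \cdot \mathbf{e}_3|\, du^1 du^2 du^3 = \sqrt{g}\, du^1 du^2 du^3,$$

where $g = \det[g_{ij}]$. [The proof of the identity $(\mathbf{e}_1 \times \mathbf{e}_2) \cdot \mathbf{e}_3 = \sqrt{g}$ is given in
Exercise 3.]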

Our results are summarized as: 


♠ Theorem:

Elements of arc length $ds_i$, area $d\sigma_i$, and volume $dV$, respectively, are
represented in terms of curvilinear coordinate systems by

$$ds_i = \sqrt{g_{ii}}\, du^i \quad (\text{no sum over } i),$$
$$d\sigma_i = \sqrt{g_{jj}\, g_{kk} - (g_{jk})^2}\, du^j du^k \quad (\text{no sum over } j \text{ and } k), \text{ and}$$
$$dV = \sqrt{g}\, du^1 du^2 du^3,$$

where $i, j, k$ is a cyclic permutation of 1, 2, 3.
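The volume element $dV = \sqrt{g}\, du^1 du^2 du^3$ can be exercised on a familiar case. The sketch below is a rough numerical illustration with assumed ingredients: the spherical metric $g_{ij} = \mathrm{diag}(1, R^2, R^2\sin^2\theta)$, a midpoint rule with a hypothetical grid size, and the unit ball as the region.

```python
import math

def metric_spherical(R, theta):
    """Covariant metric for spherical coordinates (R, theta, phi)."""
    return [[1.0, 0.0, 0.0],
            [0.0, R * R, 0.0],
            [0.0, 0.0, (R * math.sin(theta)) ** 2]]

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def ball_volume(radius, n=200):
    """Midpoint-rule integral of sqrt(g) over 0<R<radius, 0<theta<pi, 0<phi<2*pi."""
    dR, dtheta = radius / n, math.pi / n
    total = 0.0
    for a in range(n):
        R = (a + 0.5) * dR
        for b in range(n):
            theta = (b + 0.5) * dtheta
            total += math.sqrt(det3(metric_spherical(R, theta))) * dR * dtheta
    return total * 2.0 * math.pi      # sqrt(g) does not depend on phi

volume = ball_volume(1.0)
print(volume, 4.0 * math.pi / 3.0)
```

Since $\sqrt{g} = R^2 \sin\theta$ here, the integral reproduces the volume $4\pi/3$ of the unit ball up to the discretization error of the midpoint rule.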


19.2.5 Scale Factors 

In this subsection, we consider the case of orthogonal coordinate systems, for
which the basic descriptive quantities are the scale factors (or the metric
coefficients) $h_1, h_2, h_3$, defined by

$$h_1 = \sqrt{g_{11}}, \quad h_2 = \sqrt{g_{22}}, \quad h_3 = \sqrt{g_{33}}.$$

Obviously, they satisfy the equation

$$(ds)^2 = (h_1 du^1)^2 + (h_2 du^2)^2 + (h_3 du^3)^2.$$

Furthermore, since $g_{ij} = 0$ for $i \neq j$, we have

$$ds_i = h_i\, du^i \quad (\text{no sum over } i),$$
$$d\sigma_i = h_j h_k\, du^j du^k \quad (\text{no sum over } j \text{ and } k),$$
$$dV = h_1 h_2 h_3\, du^1 du^2 du^3,$$

where $i, j, k$ is a cyclic permutation of 1, 2, 3.

Examples 1. In rectangular Cartesian coordinates,

$$(ds)^2 = (dx)^2 + (dy)^2 + (dz)^2,$$

so

$$h_1 = h_2 = h_3 = 1.$$

2. In cylindrical coordinates,

$$(ds)^2 = (dR)^2 + (R\, d\theta)^2 + (dz)^2,$$

so

$$h_1 = 1, \quad h_2 = R, \quad h_3 = 1.$$

3. In spherical coordinates,

$$(ds)^2 = (dR)^2 + (R\, d\theta)^2 + (R \sin\theta\, d\phi)^2,$$

so

$$h_1 = 1, \quad h_2 = R, \quad h_3 = R\sin\theta.$$

19.2.6 Representation of Basis Vectors in Derivatives


It is often desirable to represent local covariant basis vectors $\mathbf{e}_i$ as well as
components of metric tensors $g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$ at a point $\mathbf{r}$ in terms of derivatives
of $\mathbf{r}$ with respect to the coordinates $u^i$.

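Suppose the relation between a system of curvilinear coordinates $u^1, u^2, u^3$
and an underlying system of rectangular coordinates $x_1, x_2, x_3$ $(= x, y, z)$ is
given by

$$u^i = u^i(x_k) \quad\text{and}\quad x_k = x_k(u^i), \tag{19.26}$$

where the Jacobian

$$J = \left| \frac{\partial u^i}{\partial x_k} \right|$$

is neither zero nor infinite. Writing the latter equation in (19.26) more
concisely as

$$\mathbf{r} = \mathbf{r}(u^i),$$

where $\mathbf{r} = x_k \mathbf{i}_k$ is the position arrow of an arbitrary point, we find

$$d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial u^i}\, du^i.$$

It then follows that

$$(ds)^2 = d\mathbf{r} \cdot d\mathbf{r} = \frac{\partial \mathbf{r}}{\partial u^i} \cdot \frac{\partial \mathbf{r}}{\partial u^j}\, du^i du^j,$$

which implies that the vectors of the local basis are

$$\mathbf{e}_i = \frac{\partial \mathbf{r}}{\partial u^i}$$

and the metric tensor is

$$g_{ij} = \frac{\partial \mathbf{r}}{\partial u^i} \cdot \frac{\partial \mathbf{r}}{\partial u^j}
= \frac{\partial x_k}{\partial u^i}\, \mathbf{i}_k \cdot \frac{\partial x_l}{\partial u^j}\, \mathbf{i}_l
= \frac{\partial x_k}{\partial u^i}\frac{\partial x_l}{\partial u^j}\, (\mathbf{i}_k \cdot \mathbf{i}_l)
= \frac{\partial x_k}{\partial u^i}\frac{\partial x_k}{\partial u^j}.$$

This leads to the following expression for the scale factors (for the case of
orthogonal coordinate systems):

$$h_i = \sqrt{g_{ii}} = \left| \frac{\partial \mathbf{r}}{\partial u^i} \right| = \sqrt{\sum_k \left( \frac{\partial x_k}{\partial u^i} \right)^2} \quad (\text{no sum over } i).$$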
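The expression $h_i = |\partial \mathbf{r}/\partial u^i|$ lends itself to a generic numerical check. The sketch below is only an illustration: it approximates the derivatives of $\mathbf{r}(u^i)$ by central differences with an assumed step size, uses spherical coordinates as the sample system, and compares the result with $(1, R, R\sin\theta)$; the function names are hypothetical.

```python
import math

def spherical(u):
    """Cartesian position r(u) for u = (R, theta, phi)."""
    R, theta, phi = u
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

def basis_by_differences(position, u, h=1e-6):
    """Approximate e_i = dr/du^i by central differences with step h."""
    basis = []
    for i in range(len(u)):
        plus, minus = list(u), list(u)
        plus[i] += h
        minus[i] -= h
        rp, rm = position(plus), position(minus)
        basis.append([(p - m) / (2.0 * h) for p, m in zip(rp, rm)])
    return basis

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u = (2.0, 0.6, 1.1)                       # sample point (R, theta, phi)
e = basis_by_differences(spherical, u)
scale = [math.sqrt(dot(e[i], e[i])) for i in range(3)]          # h_i = sqrt(g_ii)
off_diagonal = max(abs(dot(e[i], e[j]))
                   for i in range(3) for j in range(3) if i != j)
print(scale, off_diagonal)
```

The vanishing off-diagonal products confirm that the sample system is orthogonal, which is the condition under which the scale factors alone determine the metric.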




19.2.7 Index Lowering and Raising 

In curvilinear coordinate systems, it is possible to express a scalar product 
of two vectors via several different subscript forms. For instance, the scalar 
product of two vectors a and b may be written using their contravariant or 
covariant components: 





$$\mathbf{a} \cdot \mathbf{b} = a^i \mathbf{e}_i \cdot b^j \mathbf{e}_j = g_{ij}\, a^i b^j \tag{19.27}$$

and

$$\mathbf{a} \cdot \mathbf{b} = a_i \mathbf{e}^i \cdot b_j \mathbf{e}^j = g^{ij}\, a_i b_j. \tag{19.28}$$

Furthermore, we may express the scalar product in terms of the contravariant
components of one vector and the covariant components of the other:

$$\mathbf{a} \cdot \mathbf{b} = a_i \mathbf{e}^i \cdot b^j \mathbf{e}_j = a_i b^j\, \delta^i_j = a_i b^i \tag{19.29}$$

and

$$\mathbf{a} \cdot \mathbf{b} = a^i \mathbf{e}_i \cdot b_j \mathbf{e}^j = a^i b_j\, \delta_i^j = a^i b_i. \tag{19.30}$$

By comparing the four alternative expressions (19.27)-(19.30) for $\mathbf{a} \cdot \mathbf{b}$,
we can deduce the following useful property of $g_{ij}$ and $g^{ij}$. From (19.27) and
(19.30) we see that the identity

$$g_{ij}\, a^i b^j = a^i b_i$$

holds for arbitrary $a^i$. Hence, we have

$$g_{ij}\, b^j = b_i, \tag{19.31}$$

which illustrates the fact that the covariant components $g_{ij}$ can be used to
lower an index of $b^j$. In other words, it provides a means of obtaining the
covariant components $b_i$ of a vector from its contravariant components $b^j$. By a
similar argument, we have

$$g^{ij}\, b_j = b^i,$$

where the contravariant components $g^{ij}$ are used to raise the index $j$ attached
to $b_j$.

♠ Index lowering and raising (I):

For any vector $\mathbf{a}$, its components $a^i$ and $a_i$ are related via the
components of the metric tensor as

$$a_i = g_{ik}\, a^k \quad\text{and}\quad a^j = g^{jl}\, a_l.$$
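Lowering and raising indices, together with the four equivalent forms (19.27)-(19.30) of the scalar product, can be tried out with exact fractions. The sketch below makes several assumptions for illustration: a sample non-orthogonal metric (first two axes at $60^\circ$), its inverse written out by hand, and arbitrary sample components.

```python
from fractions import Fraction as F

# Sample metric of an oblique system (assumption) and its inverse
g_lower = [[F(1), F(1, 2), F(0)], [F(1, 2), F(1), F(0)], [F(0), F(0), F(1)]]
g_upper = [[F(4, 3), F(-2, 3), F(0)], [F(-2, 3), F(4, 3), F(0)], [F(0), F(0), F(1)]]

def lower(v_up):
    """a_i = g_ik a^k."""
    return [sum(g_lower[i][k] * v_up[k] for k in range(3)) for i in range(3)]

def raise_index(v_down):
    """a^j = g^{jl} a_l."""
    return [sum(g_upper[j][l] * v_down[l] for l in range(3)) for j in range(3)]

a_up = [F(1), F(2), F(3)]       # sample contravariant components a^i
b_up = [F(-1), F(0), F(2)]      # sample contravariant components b^i
a_down, b_down = lower(a_up), lower(b_up)

r = range(3)
dot_27 = sum(g_lower[i][j] * a_up[i] * b_up[j] for i in r for j in r)      # (19.27)
dot_28 = sum(g_upper[i][j] * a_down[i] * b_down[j] for i in r for j in r)  # (19.28)
dot_29 = sum(a_down[i] * b_up[i] for i in r)                               # (19.29)
dot_30 = sum(a_up[i] * b_down[i] for i in r)                               # (19.30)

print(a_down, raise_index(a_down))
print(dot_27, dot_28, dot_29, dot_30)
```

All four sums give the same number, and raising the lowered components returns the original ones, as expected from $g^{ik} g_{kj} = \delta^i_j$.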

The above discussion regarding vectors can be extended to tensors of arbitrary
rank. For example, the contraction with $g_{ij}$ results in a lowering of the
corresponding index:

$$T_{ij} = g_{ik}\, T^{k}{}_{j} = g_{jk}\, T_{i}{}^{k}. \tag{19.32}$$

Here the placement of the indices in the mixed components emphasizes the order of
occurrence of the indices; in fact, in general, $T^{i}{}_{j} \neq T_{j}{}^{i}$. Repeated contraction with $g_{ij}$
yields

$$T_{ij} = g_{ik}\, g_{jl}\, T^{kl}.$$

Similarly, contraction with $g^{ij}$ raises an index, i.e.,

$$T^{ij} = g^{ik}\, T_{k}{}^{j} = g^{ik} g^{jl}\, T_{kl}. \tag{19.33}$$

Comparable arguments are applicable to the local basis vectors $\mathbf{e}_i$ and $\mathbf{e}^i$, as
stated below.


♠ Index lowering and raising (II):

Local basis vectors $\mathbf{e}_i$ and $\mathbf{e}^k$ are related as

$$\mathbf{e}_i = g_{ik}\, \mathbf{e}^k \quad\text{and}\quad \mathbf{e}^j = g^{jl}\, \mathbf{e}_l.$$

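Proof Since $\mathbf{a} = a^i \mathbf{e}_i = a_j \mathbf{e}^j = a^k g_{kj}\, \mathbf{e}^j$, we have

$$a^1 (\mathbf{e}_1 - g_{1j}\, \mathbf{e}^j) + a^2 (\mathbf{e}_2 - g_{2j}\, \mathbf{e}^j) + a^3 (\mathbf{e}_3 - g_{3j}\, \mathbf{e}^j) = 0,$$

which holds for any vector $\mathbf{a}$. Hence, $\mathbf{e}_k - g_{kj}\, \mathbf{e}^j = 0$ for all $k$, i.e.,

$$\mathbf{e}_k = g_{kj}\, \mathbf{e}^j.$$

Similarly, we have

$$\mathbf{e}^k = g^{kj}\, \mathbf{e}_j. \quad ♣$$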


Exercises 


1. Show that the quantities $g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$ form the covariant components of a
second-order tensor.

Solution: In the new (primed) coordinate system we have
$g'_{ij} = \mathbf{e}'_i \cdot \mathbf{e}'_j$. Using the transformation law (19.7) of covariant
basis vectors, we have

$$g'_{ij} = \left( \frac{\partial u^k}{\partial u'^i}\, \mathbf{e}_k \right) \cdot \left( \frac{\partial u^l}{\partial u'^j}\, \mathbf{e}_l \right)
= \frac{\partial u^k}{\partial u'^i}\frac{\partial u^l}{\partial u'^j}\, (\mathbf{e}_k \cdot \mathbf{e}_l)
= \frac{\partial u^k}{\partial u'^i}\frac{\partial u^l}{\partial u'^j}\, g_{kl}.$$

This clearly indicates that the $g_{ij}$ are covariant components of a
second-order tensor (i.e., the metric tensor $\mathbf{g}$). A similar argument
shows that the quantities $g^{ij}$ form the contravariant components
of $\mathbf{g}$, which transform as follows:

$$g'^{ij} = \frac{\partial u'^i}{\partial u^k}\frac{\partial u'^j}{\partial u^l}\, g^{kl}. \quad ♣$$

2. Show that the matrix $[g^{ij}]$ is the inverse of the matrix $[g_{ij}]$.

Solution: For an arbitrary vector $\mathbf{a}$, we find $a^i = g^{ij} a_j = g^{ij} g_{jk}\, a^k$. Since $\mathbf{a}$ is
arbitrary, we must have

$$g^{ij} g_{jk} = \delta^i_k = \begin{cases} 1 & (i = k), \\ 0 & (i \neq k). \end{cases} \tag{19.34}$$

This clearly indicates that the matrices $[g_{ij}]$ and $[g^{ij}]$ are inverse to each
other. ♣

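3. Show that $\sqrt{g} = (\mathbf{e}_i \times \mathbf{e}_j) \cdot \mathbf{e}_k$
and $1/\sqrt{g} = (\mathbf{e}^i \times \mathbf{e}^j) \cdot \mathbf{e}^k$, where
$i, j, k$ is a cyclic permutation of the numbers 1, 2, 3.

Solution: By direct calculation, we obtain

$$ g^{i\ell} = \mathbf{e}^i \cdot \mathbf{e}^\ell = \frac{(\mathbf{e}_j \times \mathbf{e}_k) \cdot (\mathbf{e}_m \times \mathbf{e}_n)}{[\mathbf{e}_i \cdot (\mathbf{e}_j \times \mathbf{e}_k)] [\mathbf{e}_\ell \cdot (\mathbf{e}_m \times \mathbf{e}_n)]}, \tag{19.35} $$

where $i, j, k$ and $\ell, m, n$ are cyclic permutations of the ordered set of
numbers 1, 2, 3. The numerator in (19.35) reads

$$ \begin{aligned}
(\mathbf{e}_j \times \mathbf{e}_k) \cdot (\mathbf{e}_m \times \mathbf{e}_n)
&= [(\mathbf{e}_j \times \mathbf{e}_k) \times \mathbf{e}_m] \cdot \mathbf{e}_n \\
&= [(\mathbf{e}_j \cdot \mathbf{e}_m) \mathbf{e}_k - (\mathbf{e}_k \cdot \mathbf{e}_m) \mathbf{e}_j] \cdot \mathbf{e}_n \\
&= (\mathbf{e}_j \cdot \mathbf{e}_m)(\mathbf{e}_k \cdot \mathbf{e}_n) - (\mathbf{e}_k \cdot \mathbf{e}_m)(\mathbf{e}_j \cdot \mathbf{e}_n) \\
&= \begin{vmatrix} g_{jm} & g_{km} \\ g_{jn} & g_{kn} \end{vmatrix} = C^{i\ell}.
\end{aligned} $$

Here, $C^{i\ell}$ is the cofactor of $g_{i\ell}$ in the determinant
$g = \det[g_{i\ell}]$. Comparing the results with the definition
$g^{i\ell} = C^{i\ell}/g$, we find that

$$ g = [\mathbf{e}_i \cdot (\mathbf{e}_j \times \mathbf{e}_k)] [\mathbf{e}_\ell \cdot (\mathbf{e}_m \times \mathbf{e}_n)], $$

which is equivalent to

$$ g = [\mathbf{e}_i \cdot (\mathbf{e}_j \times \mathbf{e}_k)]^2, \quad \text{i.e.,} \quad \sqrt{g} = \pm \mathbf{e}_i \cdot (\mathbf{e}_j \times \mathbf{e}_k), $$

where the plus sign is chosen if the given basis is right-handed. In a similar
manner, via the analogous cofactor relation for $g_{i\ell}$ in terms of the
matrix $[g^{i\ell}]$ and
$\det[\delta^i_k] = \det[g^{ij} g_{jk}] = \det[g^{ij}] \det[g_{jk}] = 1$, we
obtain

$$ \frac{1}{\sqrt{g}} = \pm \mathbf{e}^i \cdot (\mathbf{e}^j \times \mathbf{e}^k). $$ ♣

19.3 Christoffel Symbols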

19.3.1 Derivatives of Basis Vectors 

Several new concepts are required for the differentiation of vectors or tensors
with respect to curvilinear coordinates. Recall that in a general curvilinear
coordinate system, the basis vectors $\mathbf{e}_i$ and $\mathbf{e}^i$ are
functions of the coordinates. This implies that differentiation of vectors
(say, $\mathbf{v} = v^i \mathbf{e}_i$) or tensors (say,
$\mathbf{T} = T^{ij} \mathbf{e}_i \otimes \mathbf{e}_j$) involves their
derivatives, such as $\partial \mathbf{e}_i / \partial u^j$.

Suppose that the derivative $\partial \mathbf{e}_i / \partial u^j$ can be
written as a linear combination of the basis vectors $\mathbf{e}_k$ as denoted
by

$$ \frac{\partial \mathbf{e}_i}{\partial u^j} = \Gamma^k_{ij} \mathbf{e}_k, \tag{19.36} $$

the symbol $\Gamma^k_{ij}$ being the coefficient associated with the $k$th
component of the linear combination. Using the reciprocity relation
$\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j$, we write this as

$$ \Gamma^k_{ij} = \mathbf{e}^k \cdot \frac{\partial \mathbf{e}_i}{\partial u^j}. \tag{19.37} $$

This three-index symbol is called a Christoffel symbol. In a similar manner
as above, we can show that the derivative of the contravariant basis vectors
reads

$$ \frac{\partial \mathbf{e}^i}{\partial u^j} = -\Gamma^i_{kj} \mathbf{e}^k. \tag{19.38} $$

Details of the derivation are given in Exercise 1.

We shall see that Christoffel symbols play a key role in defining the
derivatives of vectors and tensors in terms of general coordinate systems. A
more formal definition of Christoffel symbols in terms of metric tensors is
given in Sect. 19.3.4.


Remark. It is clear from (19.37) that in Cartesian coordinate systems,
$\Gamma^k_{ij} = 0$ for all values of the indices $i$, $j$, and $k$, owing to
the identity $\partial \mathbf{e}_i / \partial u^j = 0$.

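Example 1. Let us calculate the Christoffel symbols $\Gamma^m_{ij}$ for
cylindrical coordinates, where $(u^1, u^2, u^3) = (\rho, \phi, z)$ and the
position vector $\mathbf{r}$ of any point may be written

$$ \mathbf{r} = \rho \cos\phi \, \mathbf{i} + \rho \sin\phi \, \mathbf{j} + z \mathbf{k}. $$

From this we find that the covariant basis vectors are given by

$$ \mathbf{e}_\rho = \frac{\partial \mathbf{r}}{\partial \rho} = \cos\phi \, \mathbf{i} + \sin\phi \, \mathbf{j}, \tag{19.39} $$

$$ \mathbf{e}_\phi = \frac{\partial \mathbf{r}}{\partial \phi} = -\rho \sin\phi \, \mathbf{i} + \rho \cos\phi \, \mathbf{j}, \tag{19.40} $$

$$ \mathbf{e}_z = \frac{\partial \mathbf{r}}{\partial z} = \mathbf{k}. \tag{19.41} $$

It is a straightforward matter to show that the only derivatives of these
vectors that are nonzero with respect to the coordinates are

$$ \frac{\partial \mathbf{e}_\rho}{\partial \phi} = \frac{1}{\rho} \mathbf{e}_\phi, \qquad \frac{\partial \mathbf{e}_\phi}{\partial \rho} = \frac{1}{\rho} \mathbf{e}_\phi, \qquad \frac{\partial \mathbf{e}_\phi}{\partial \phi} = -\rho \, \mathbf{e}_\rho. $$

Thus, from (19.37), we immediately have

$$ \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{\rho} \quad \text{and} \quad \Gamma^1_{22} = -\rho. \tag{19.42} $$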
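
The definition (19.37) can also be evaluated numerically. The sketch below is illustrative only: it differentiates the basis vectors (19.39)-(19.41) by central differences and, as a simplifying assumption valid for orthogonal systems, builds the reciprocal basis from $\mathbf{e}^k = \mathbf{e}_k / (\mathbf{e}_k \cdot \mathbf{e}_k)$. Indices are 0-based, so `christoffel(1, 0, 1, ...)` stands for $\Gamma^2_{12}$; all function names are hypothetical.

```python
import math


def dot(a, b):
    """Euclidean dot product of two Cartesian component lists."""
    return sum(x * y for x, y in zip(a, b))


def basis_vectors(rho, phi, z):
    """Covariant basis e_rho, e_phi, e_z in Cartesian components, (19.39)-(19.41)."""
    return [
        [math.cos(phi), math.sin(phi), 0.0],
        [-rho * math.sin(phi), rho * math.cos(phi), 0.0],
        [0.0, 0.0, 1.0],
    ]


def reciprocal_basis(rho, phi, z):
    """Contravariant basis e^k; assumes an orthogonal system: e^k = e_k / |e_k|^2."""
    return [[c / dot(e, e) for c in e] for e in basis_vectors(rho, phi, z)]


def christoffel(k, i, j, point, h=1e-6):
    """Gamma^k_ij = e^k . d(e_i)/d(u^j), derivative by central difference."""
    plus, minus = list(point), list(point)
    plus[j] += h
    minus[j] -= h
    e_plus = basis_vectors(*plus)[i]
    e_minus = basis_vectors(*minus)[i]
    derivative = [(p - m) / (2.0 * h) for p, m in zip(e_plus, e_minus)]
    return dot(reciprocal_basis(*point)[k], derivative)


point = (2.0, 0.7, 1.0)                  # sample (rho, phi, z)
gamma_2_12 = christoffel(1, 0, 1, point)  # expected close to 1/rho
gamma_1_22 = christoffel(0, 1, 1, point)  # expected close to -rho
```

At the sample point $\rho = 2$ the two values should agree with (19.42) up to the finite-difference error.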

19.3.2 Nontensor Character 

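Despite their appearance, the Christoffel symbols $\Gamma^k_{ij}$ do not form
the components of a third-order tensor.

♠ Theorem:

Christoffel symbols do not form any kind of tensor.


Proof This is verified by considering their transformation behavior under a
general coordinate transformation. In a transformed coordinate system, we
have

$$ \Gamma'^k_{ij} = \mathbf{e}'^k \cdot \frac{\partial \mathbf{e}'_i}{\partial u'^j}. \tag{19.43} $$

Applying the transformation law of local basis vectors, we obtain

$$ \begin{aligned}
\Gamma'^k_{ij}
&= \left( \frac{\partial u'^k}{\partial u^n} \mathbf{e}^n \right) \cdot \frac{\partial}{\partial u'^j} \left( \frac{\partial u^l}{\partial u'^i} \mathbf{e}_l \right) \\
&= \left( \frac{\partial u'^k}{\partial u^n} \mathbf{e}^n \right) \cdot \left( \frac{\partial^2 u^l}{\partial u'^j \partial u'^i} \mathbf{e}_l + \frac{\partial u^l}{\partial u'^i} \frac{\partial \mathbf{e}_l}{\partial u'^j} \right) \\
&= \frac{\partial u'^k}{\partial u^n} \frac{\partial^2 u^l}{\partial u'^j \partial u'^i} (\mathbf{e}^n \cdot \mathbf{e}_l) + \frac{\partial u'^k}{\partial u^n} \frac{\partial u^l}{\partial u'^i} \left( \mathbf{e}^n \cdot \frac{\partial \mathbf{e}_l}{\partial u'^j} \right) \\
&= \frac{\partial u'^k}{\partial u^l} \frac{\partial^2 u^l}{\partial u'^j \partial u'^i} + \frac{\partial u'^k}{\partial u^n} \frac{\partial u^l}{\partial u'^i} \frac{\partial u^m}{\partial u'^j} \left( \mathbf{e}^n \cdot \frac{\partial \mathbf{e}_l}{\partial u^m} \right) \\
&= \frac{\partial u'^k}{\partial u^l} \frac{\partial^2 u^l}{\partial u'^j \partial u'^i} + \frac{\partial u'^k}{\partial u^n} \frac{\partial u^l}{\partial u'^i} \frac{\partial u^m}{\partial u'^j} \Gamma^n_{lm}.
\end{aligned} \tag{19.44} $$

Hence, the presence of the first term in the last line in (19.44) prevents the
$\Gamma^k_{ij}$ from forming a third-order tensor. ♣

19.3.3 Properties of Christoffel Symbols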


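Christoffel symbols $\Gamma^k_{ij}$ satisfy the following relations:

1. $\Gamma^k_{ij} = \Gamma^k_{ji}$.

2. $\dfrac{\partial g_{ij}}{\partial u^k} = g_{\ell j} \Gamma^\ell_{ik} + g_{i\ell} \Gamma^\ell_{jk}$.

3. $\dfrac{\partial g^{ij}}{\partial u^k} = -g^{i\ell} \Gamma^j_{\ell k} - g^{j\ell} \Gamma^i_{\ell k}$.

4. $\Gamma^i_{ki} = \dfrac{\partial}{\partial u^k} \log \sqrt{g} = \dfrac{1}{\sqrt{g}} \dfrac{\partial \sqrt{g}}{\partial u^k}$.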

Proofs of these relations are given in Exercises 2-4. 


Remark. Some textbooks refer to our three-index symbol $\Gamma^k_{ij}$ defined
by (19.37) as the Christoffel symbol of the second kind and use the following
notation:

$$ \begin{Bmatrix} k \\ i \; j \end{Bmatrix} = \mathbf{e}^k \cdot \frac{\partial \mathbf{e}_i}{\partial u^j}. \tag{19.45} $$

As a counterpart, we may define the Christoffel symbol of the first kind
$[k, ij]$ by

$$ [k, ij] = \mathbf{e}_k \cdot \frac{\partial \mathbf{e}_i}{\partial u^j}. \tag{19.46} $$

Note that the index $k$ on the right-hand side of (19.45) is a superscript,
whereas that of (19.46) is a subscript. These two kinds of Christoffel symbols
are related to each other as

$$ \begin{Bmatrix} k \\ i \; j \end{Bmatrix} = g^{k\ell} [\ell, ij]. $$


19.3.4 Alternative Expression 

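In principle, we can calculate the $\Gamma^k_{ij}$ in a given coordinate system
using the expression (19.37) based on $\mathbf{e}_i$. However, it is simpler to
use an alternative expression in terms of the metric tensor $g_{ij}$ and its
derivatives as stated below.

♠ Theorem:

Christoffel symbols are expressed as

$$ \Gamma^m_{ij} = \frac{1}{2} g^{mk} \left( \frac{\partial g_{jk}}{\partial u^i} + \frac{\partial g_{ki}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^k} \right). $$

Proof Recall the relation

$$ \frac{\partial g_{ij}}{\partial u^k} = g_{\ell j} \Gamma^\ell_{ik} + g_{i\ell} \Gamma^\ell_{jk} \tag{19.47} $$

given in Sect. 19.3.3. By cyclically permuting the free indices $i, j, k$, we
obtain two further equivalent relations:

$$ \frac{\partial g_{jk}}{\partial u^i} = g_{\ell k} \Gamma^\ell_{ji} + g_{j\ell} \Gamma^\ell_{ki} \tag{19.48} $$

and

$$ \frac{\partial g_{ki}}{\partial u^j} = g_{\ell i} \Gamma^\ell_{kj} + g_{k\ell} \Gamma^\ell_{ij}. \tag{19.49} $$

Then, subtracting (19.47) from the sum of (19.48) and (19.49), we find

$$ \begin{aligned}
\frac{\partial g_{jk}}{\partial u^i} + \frac{\partial g_{ki}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^k}
&= \Gamma^\ell_{ji} g_{\ell k} + \Gamma^\ell_{ki} g_{j\ell} + \Gamma^\ell_{kj} g_{\ell i} + \Gamma^\ell_{ij} g_{k\ell} - \Gamma^\ell_{ik} g_{\ell j} - \Gamma^\ell_{jk} g_{i\ell} \\
&= (\Gamma^\ell_{ji} g_{\ell k} + \Gamma^\ell_{ij} g_{k\ell}) + (\Gamma^\ell_{ki} g_{j\ell} - \Gamma^\ell_{ik} g_{\ell j}) + (\Gamma^\ell_{kj} g_{\ell i} - \Gamma^\ell_{jk} g_{i\ell}) \\
&= 2 \Gamma^\ell_{ij} g_{k\ell} + 0 + 0 = 2 \Gamma^\ell_{ij} g_{k\ell},
\end{aligned} \tag{19.50} $$

where we have used the symmetry properties $g_{ij} = g_{ji}$ and
$\Gamma^k_{ij} = \Gamma^k_{ji}$. Contracting both sides with $g^{mk}$ yields

$$ g^{mk} \left( \frac{\partial g_{jk}}{\partial u^i} + \frac{\partial g_{ki}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^k} \right) = 2 \Gamma^\ell_{ij} g_{k\ell} g^{mk} = 2 \Gamma^\ell_{ij} \delta^m_\ell = 2 \Gamma^m_{ij}, $$

i.e.,

$$ \Gamma^m_{ij} = \frac{1}{2} g^{mk} \left( \frac{\partial g_{jk}}{\partial u^i} + \frac{\partial g_{ki}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^k} \right). \tag{19.51} $$

This result enables us to compute the Christoffel symbols of a given coordinate
system from information about the metric tensor. ♣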


Example 2. We again evaluate the Christoffel symbols $\Gamma^m_{ij}$ for
cylindrical coordinates. Using (19.51) and the fact that $g_{11} = 1$,
$g_{22} = \rho^2$, $g_{33} = 1$ and the other components are zero, we see that
the only three nonzero Christoffel symbols are indeed
$\Gamma^2_{12} = \Gamma^2_{21}$ and $\Gamma^1_{22}$, given by

$$ \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{2 g_{22}} \frac{\partial g_{22}}{\partial u^1} = \frac{1}{2 \rho^2} \frac{\partial}{\partial \rho} \rho^2 = \frac{1}{\rho}, \tag{19.52} $$

$$ \Gamma^1_{22} = -\frac{1}{2 g_{11}} \frac{\partial g_{22}}{\partial u^1} = -\frac{1}{2} \frac{\partial}{\partial \rho} \rho^2 = -\rho; \tag{19.53} $$

they agree with the expressions in (19.42).

Remark. The result (19.51) implies that the Christoffel symbol of the first
kind $[k, ij]$ mentioned in (19.46) is written as

$$ [k, ij] = \frac{1}{2} \left( \frac{\partial g_{jk}}{\partial u^i} + \frac{\partial g_{ki}}{\partial u^j} - \frac{\partial g_{ij}}{\partial u^k} \right). $$
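
The metric formula (19.51) lends itself to a direct numerical check. The following sketch is only an illustration under stated assumptions: it uses the cylindrical metric $g_{ij} = \mathrm{diag}(1, \rho^2, 1)$, inverts it with a helper that is valid for diagonal metrics only, and takes derivatives by central differences. Indices are 0-based and all names are hypothetical.

```python
def metric(u):
    """Covariant metric g_ij for cylindrical coordinates u = [rho, phi, z]."""
    rho = u[0]
    return [[1.0, 0.0, 0.0],
            [0.0, rho * rho, 0.0],
            [0.0, 0.0, 1.0]]


def inverse_diagonal(g):
    """Inverse of a diagonal matrix (assumption: off-diagonal entries vanish)."""
    n = len(g)
    return [[1.0 / g[i][i] if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def d_metric(a, b, c, u, h=1e-6):
    """Central-difference estimate of d g_ab / d u^c."""
    plus, minus = list(u), list(u)
    plus[c] += h
    minus[c] -= h
    return (metric(plus)[a][b] - metric(minus)[a][b]) / (2.0 * h)


def christoffel_from_metric(m, i, j, u):
    """Gamma^m_ij = (1/2) g^mk (d_i g_jk + d_j g_ki - d_k g_ij), cf. (19.51)."""
    g_inv = inverse_diagonal(metric(u))
    return 0.5 * sum(
        g_inv[m][k] * (d_metric(j, k, i, u) + d_metric(k, i, j, u)
                       - d_metric(i, j, k, u))
        for k in range(3)
    )


u = [2.0, 0.7, 1.0]  # sample point (rho, phi, z)
symbols = {(m, i, j): christoffel_from_metric(m, i, j, u)
           for m in range(3) for i in range(3) for j in range(3)}
```

In this sketch only the entries `(1, 0, 1)`, `(1, 1, 0)` and `(0, 1, 1)` of `symbols` should differ appreciably from zero, in line with (19.52) and (19.53).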


Exercises 


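1. Derive equation (19.38).

Solution: By differentiating the reciprocity relation
$\mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j$ with respect to the coordinates,
we have

$$ \frac{\partial \mathbf{e}^i}{\partial u^k} \cdot \mathbf{e}_j + \mathbf{e}^i \cdot \frac{\partial \mathbf{e}_j}{\partial u^k} = \frac{\partial \delta^i_j}{\partial u^k}. \tag{19.54} $$

The right-hand side of (19.54) vanishes since the element $\delta^i_j$ consists
of the constants $+1$ and $0$, which are independent of the coordinates $u^k$.
Hence, using (19.37), we obtain

$$ \frac{\partial \mathbf{e}^i}{\partial u^k} \cdot \mathbf{e}_j + \Gamma^i_{jk} = 0. \tag{19.55} $$

Similar to the case of (19.36), for the moment we write the derivative
$\partial \mathbf{e}^i / \partial u^k$ as a linear combination of the basis
vectors $\mathbf{e}^\ell$ as

$$ \frac{\partial \mathbf{e}^i}{\partial u^k} = B^i_{\ell k} \mathbf{e}^\ell. \tag{19.56} $$

Substituting (19.56) into (19.55), we obtain $B^i_{jk} = -\Gamma^i_{jk}$.
Consequently, we have
$\partial \mathbf{e}^i / \partial u^k = -\Gamma^i_{\ell k} \mathbf{e}^\ell$, or
equivalently (by renaming the indices),

$$ \frac{\partial \mathbf{e}^i}{\partial u^j} = -\Gamma^i_{kj} \mathbf{e}^k. $$ ♣

2. Show that $\Gamma^k_{ij} = \Gamma^k_{ji}$.

Solution: It follows that

$$ \frac{\partial \mathbf{e}_i}{\partial u^j} = \frac{\partial}{\partial u^j} \frac{\partial \mathbf{r}}{\partial u^i} = \frac{\partial}{\partial u^i} \frac{\partial \mathbf{r}}{\partial u^j} = \frac{\partial \mathbf{e}_j}{\partial u^i}, $$

which, in view of (19.37), implies $\Gamma^k_{ij} = \Gamma^k_{ji}$. ♣

3. Show that $\dfrac{\partial g_{ij}}{\partial u^k} = g_{\ell j} \Gamma^\ell_{ik} + g_{i\ell} \Gamma^\ell_{jk}$.

Solution: Derivatives of the metric tensor
$g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$ with respect to $u^k$ read

$$ \frac{\partial g_{ij}}{\partial u^k} = \frac{\partial \mathbf{e}_i}{\partial u^k} \cdot \mathbf{e}_j + \mathbf{e}_i \cdot \frac{\partial \mathbf{e}_j}{\partial u^k} = \Gamma^\ell_{ik} \mathbf{e}_\ell \cdot \mathbf{e}_j + \mathbf{e}_i \cdot \Gamma^\ell_{jk} \mathbf{e}_\ell = \Gamma^\ell_{ik} g_{\ell j} + \Gamma^\ell_{jk} g_{i\ell}. $$ ♣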


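4. Show that $\dfrac{\partial g}{\partial u^k} = g g^{ij} \dfrac{\partial g_{ij}}{\partial u^k}$, where $g = \det[g_{ij}]$.

Solution: We know that the determinant $g$ is given by (see Sect. 18.1.7)

$$ g = \sum_{j=1}^{n} g_{ij} C^{ij} \quad \text{(with $i$ fixed)}, $$

where $C^{ij}$ is the cofactor of the element $g_{ij}$ in $g$. Partially
differentiating both sides with respect to $g_{ij}$ gives

$$ \frac{\partial g}{\partial g_{ij}} = C^{ij}. \tag{19.57} $$

Since $g^{ij} = C^{ij}/g$ (see Sect. 19.2.4), it follows from (19.57) that

$$ \frac{\partial g}{\partial u^k} = \frac{\partial g}{\partial g_{ij}} \frac{\partial g_{ij}}{\partial u^k} = C^{ij} \frac{\partial g_{ij}}{\partial u^k} = g g^{ij} \frac{\partial g_{ij}}{\partial u^k}. $$ ♣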


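5. Show that $\Gamma^i_{ki} = \dfrac{\partial}{\partial u^k} \log \sqrt{g}$.

Solution: According to the expression (19.51), we have

$$ \Gamma^i_{ki} = \frac{1}{2} g^{i\ell} \left( \frac{\partial g_{i\ell}}{\partial u^k} + \frac{\partial g_{\ell k}}{\partial u^i} - \frac{\partial g_{ki}}{\partial u^\ell} \right). $$

The last two terms in the parentheses cancel out because

$$ g^{i\ell} \frac{\partial g_{\ell k}}{\partial u^i} = g^{\ell i} \frac{\partial g_{ik}}{\partial u^\ell} = g^{i\ell} \frac{\partial g_{ki}}{\partial u^\ell}, $$

where we have interchanged the dummy indices $i$ and $\ell$ in the first
equality, and have used the symmetry of the metric tensor in the second. Hence,
we obtain

$$ \Gamma^i_{ki} = \frac{1}{2} g^{i\ell} \frac{\partial g_{i\ell}}{\partial u^k}. \tag{19.58} $$

This can be further simplified by using the result of Exercise 4 as

$$ \Gamma^i_{ki} = \frac{1}{2g} \frac{\partial g}{\partial u^k} = \frac{1}{2g} \frac{\partial g}{\partial \sqrt{g}} \frac{\partial \sqrt{g}}{\partial u^k} = \frac{1}{\sqrt{g}} \frac{\partial \sqrt{g}}{\partial u^k} = \frac{\partial}{\partial u^k} \log \sqrt{g}. $$ ♣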
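
As an illustration of the result of Exercise 5, consider spherical coordinates. The worked lines below are a sketch resting on an assumption not stated in this section, namely $(u^1, u^2, u^3) = (r, \theta, \phi)$ with the standard metric $g_{11} = 1$, $g_{22} = r^2$, $g_{33} = r^2 \sin^2\theta$.

```latex
% Assumption: spherical coordinates (u^1, u^2, u^3) = (r, \theta, \phi) with
% g_{11} = 1, g_{22} = r^2, g_{33} = r^2 \sin^2\theta, all other g_{ij} = 0.
\begin{aligned}
g &= \det[g_{ij}] = r^4 \sin^2\theta, \qquad \sqrt{g} = r^2 \sin\theta, \\
\Gamma^{i}_{1i} &= \frac{\partial}{\partial r} \log\left( r^2 \sin\theta \right) = \frac{2}{r}, \\
\Gamma^{i}_{2i} &= \frac{\partial}{\partial \theta} \log\left( r^2 \sin\theta \right) = \cot\theta, \\
\Gamma^{i}_{3i} &= \frac{\partial}{\partial \phi} \log\left( r^2 \sin\theta \right) = 0.
\end{aligned}
```

The first of these sums agrees with what (19.51) gives term by term for this metric: $\Gamma^1_{11} = 0$ and $\Gamma^2_{12} = \Gamma^3_{13} = 1/r$, which add up to $2/r$.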
19.4 Covariant Derivatives


19.4.1 Covariant Derivatives of Vectors 


The derivatives of a scalar in terms of Cartesian coordinates work as covariant
components of a vector. This is also true for the case of general coordinate
systems, as can be shown by considering the differential of a scalar

$$ d\phi = \frac{\partial \phi}{\partial u^i} du^i. $$

Since the $du^i$ are contravariant components of a vector and $d\phi$ is a
scalar, we see from the quotient law that the quantities
$\partial \phi / \partial u^i$ must form covariant components of a vector.

Except for a scalar, however, the derivatives of a general tensor do not
necessarily form the components of another tensor. To see this, we consider
the derivative of the contravariant components $v^i$ of a vector $\mathbf{v}$
with respect to a general coordinate $u^j$. In a new (primed) coordinate
system, it reads

$$ \frac{\partial v'^i}{\partial u'^j} = \frac{\partial u^k}{\partial u'^j} \frac{\partial v'^i}{\partial u^k} = \frac{\partial u^k}{\partial u'^j} \frac{\partial}{\partial u^k} \left( \frac{\partial u'^i}{\partial u^l} v^l \right) = \frac{\partial u^k}{\partial u'^j} \frac{\partial u'^i}{\partial u^l} \frac{\partial v^l}{\partial u^k} + \frac{\partial u^k}{\partial u'^j} \frac{\partial^2 u'^i}{\partial u^k \partial u^l} v^l. \tag{19.59} $$

The presence of the second term in the last expression of (19.59) prevents the
derivative $\partial v^i / \partial u^j$ from obeying the transformation law of
the components of a second-order tensor. The nontensor character stems from
the fact that the second-order derivative,

$$ \frac{\partial^2 u'^i}{\partial u^k \partial u^l}, \tag{19.60} $$

involved in the last expression of (19.59) does not vanish. In fact, the
first-order derivative $\partial u'^i / \partial u^l$ is not constant in
non-Cartesian coordinates, whereas it is constant in Cartesian coordinates [so
that the term (19.60) vanishes in the latter case].

In the context above, it is natural to introduce a new class of differentiation
that turns the derivatives of components of a tensor into components of
another tensor. This is achieved with the help of the Christoffel symbols
discussed in Sect. 19.3. Let us consider the derivative of a vector
$\mathbf{v}$ with respect to the coordinates $u^j$. We find

$$ \frac{\partial \mathbf{v}}{\partial u^j} = \frac{\partial v^i}{\partial u^j} \mathbf{e}_i + v^i \frac{\partial \mathbf{e}_i}{\partial u^j}, \tag{19.61} $$

where the second term arises because, in general, the basis vectors are not
constant. Using (19.36), we write

$$ \frac{\partial \mathbf{v}}{\partial u^j} = \frac{\partial v^i}{\partial u^j} \mathbf{e}_i + v^i \Gamma^k_{ij} \mathbf{e}_k. $$

Since $i$ and $k$ are dummy indices, we may interchange them to obtain

$$ \frac{\partial \mathbf{v}}{\partial u^j} = \frac{\partial v^i}{\partial u^j} \mathbf{e}_i + v^k \Gamma^i_{kj} \mathbf{e}_i = \left( \frac{\partial v^i}{\partial u^j} + v^k \Gamma^i_{kj} \right) \mathbf{e}_i. \tag{19.62} $$

The quantity in parentheses is referred to specifically as the covariant
derivative of a vector:


♠ Covariant derivative of a vector:

The quantities defined by

$$ v^i{}_{;j} = \frac{\partial v^i}{\partial u^j} + \Gamma^i_{kj} v^k \tag{19.63} $$

are called covariant derivatives of the contravariant components $v^i$ of a
vector $\mathbf{v}$ with respect to $u^j$. Here, the semicolon subscript on the
left-hand side denotes covariant differentiation.


Using this notation, we may write the derivative of a vector in the very
compact form

$$ \frac{\partial \mathbf{v}}{\partial u^j} = v^i{}_{;j} \mathbf{e}_i. $$

The corresponding result for the covariant components $v_i$ can be found in a
similar way by considering the derivative of $\mathbf{v} = v_i \mathbf{e}^i$
and using (19.38) to obtain

$$ v_{i;j} = \frac{\partial v_i}{\partial u^j} - \Gamma^k_{ij} v_k. \tag{19.64} $$
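
A simple way to see what (19.63) accomplishes is to take a vector field that is constant in Cartesian terms. The sketch below assumes the uniform field $\mathbf{v} = \mathbf{i}$, whose contravariant cylindrical components are $v^\rho = \cos\phi$, $v^\phi = -\sin\phi/\rho$, $v^z = 0$; the partial derivatives of these components do not vanish, but the covariant derivatives should. The Christoffel symbols are those of (19.42); indices are 0-based and all names are hypothetical.

```python
import math


def gamma(i, k, j, u):
    """Gamma^i_kj for cylindrical coordinates, taken from (19.42)."""
    rho = u[0]
    if i == 1 and {k, j} == {0, 1}:      # Gamma^phi_{rho phi} = Gamma^phi_{phi rho}
        return 1.0 / rho
    if i == 0 and k == 1 and j == 1:     # Gamma^rho_{phi phi}
        return -rho
    return 0.0


def field(u):
    """Contravariant components of the uniform field v = i (assumed example)."""
    rho, phi, _ = u
    return [math.cos(phi), -math.sin(phi) / rho, 0.0]


def partial_derivative(i, j, u, h=1e-6):
    """Central-difference estimate of d v^i / d u^j."""
    plus, minus = list(u), list(u)
    plus[j] += h
    minus[j] -= h
    return (field(plus)[i] - field(minus)[i]) / (2.0 * h)


def covariant_derivative(i, j, u):
    """v^i_{;j} = d v^i / d u^j + Gamma^i_kj v^k, cf. (19.63)."""
    v = field(u)
    return partial_derivative(i, j, u) + sum(
        gamma(i, k, j, u) * v[k] for k in range(3))


u = [2.0, 0.7, 1.0]  # sample point (rho, phi, z)
cov = [[covariant_derivative(i, j, u) for j in range(3)] for i in range(3)]
plain = partial_derivative(0, 1, u)   # d v^rho / d phi = -sin(phi), not zero
```

Every entry of `cov` should vanish up to finite-difference error, whereas `plain` does not; the Christoffel terms compensate exactly for the change of the basis vectors.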


19.4.2 Remarks on Covariant Derivatives 

1. The arrangement of indices $i, j, k$ in the Christoffel symbols in (19.63)
and (19.64) can be determined systematically in the following manner. First,
the index with respect to which the derivative is taken (i.e., $j$ in this
case) is the last subscript on the Christoffel symbol. Secondly, the other
index appearing on the left-hand side (i.e., $i$ in this case) also appears in
the Christoffel symbol on the right-hand side without raising or lowering. The
remaining index can then be arranged in only one way.

2. Similar to $v^i{}_{;j}$, a comparable short-hand notation for partial
derivatives is obtained by replacing the semicolon by a comma such as

$$ v^i{}_{,j} = \frac{\partial v^i}{\partial u^j} \quad \text{and} \quad v_{i,j} = \frac{\partial v_i}{\partial u^j}. $$

3. In Cartesian coordinates, all the $\Gamma^i_{kj}$ are zero, so the covariant
derivative reduces to the simple partial derivative, say,
$v^i{}_{;j} = v^i{}_{,j}$.


19.4.3 Covariant Derivatives of Tensors 


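Covariant derivatives of higher-order tensors can be defined by a procedure
similar to the one for vectors. As an example, let us consider the derivative of
the second-order tensor $\mathbf{T}$ with respect to the coordinate $u^k$.
Expressing $\mathbf{T}$ in terms of its contravariant components, we have

$$ \frac{\partial \mathbf{T}}{\partial u^k} = \frac{\partial}{\partial u^k} \left( T^{ij} \mathbf{e}_i \otimes \mathbf{e}_j \right) = \frac{\partial T^{ij}}{\partial u^k} \mathbf{e}_i \otimes \mathbf{e}_j + T^{ij} \frac{\partial \mathbf{e}_i}{\partial u^k} \otimes \mathbf{e}_j + T^{ij} \mathbf{e}_i \otimes \frac{\partial \mathbf{e}_j}{\partial u^k}. \tag{19.65} $$

Using Christoffel symbols, we obtain

$$ \frac{\partial \mathbf{T}}{\partial u^k} = \frac{\partial T^{ij}}{\partial u^k} \mathbf{e}_i \otimes \mathbf{e}_j + T^{ij} \Gamma^\ell_{ik} \mathbf{e}_\ell \otimes \mathbf{e}_j + T^{ij} \mathbf{e}_i \otimes \Gamma^\ell_{jk} \mathbf{e}_\ell. $$

Interchanging the dummy indices $i$ and $\ell$ in the second term and $j$ and
$\ell$ in the third term on the right-hand side, we obtain

$$ \frac{\partial \mathbf{T}}{\partial u^k} = \left( \frac{\partial T^{ij}}{\partial u^k} + \Gamma^i_{\ell k} T^{\ell j} + \Gamma^j_{\ell k} T^{i\ell} \right) \mathbf{e}_i \otimes \mathbf{e}_j, $$

where the expression in parentheses is the required covariant derivative defined
by

$$ T^{ij}{}_{;k} = \frac{\partial T^{ij}}{\partial u^k} + \Gamma^i_{\ell k} T^{\ell j} + \Gamma^j_{\ell k} T^{i\ell}. \tag{19.66} $$

Using the notation (19.66), we can write the derivative of the tensor
$\mathbf{T}$ with respect to $u^k$ as

$$ \frac{\partial \mathbf{T}}{\partial u^k} = T^{ij}{}_{;k} \, \mathbf{e}_i \otimes \mathbf{e}_j. $$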

Results similar to (19.66) can be obtained for the covariant derivatives of
the mixed and covariant components of a second-order tensor. Collecting all
of these results leads to the following:

♠ Covariant derivative of a tensor:

Covariant derivatives of components of a second-order tensor $\mathbf{T}$ are
given by

$$ \begin{aligned}
T^{ij}{}_{;k} &= T^{ij}{}_{,k} + \Gamma^i_{\ell k} T^{\ell j} + \Gamma^j_{\ell k} T^{i\ell}, \\
T^{i}{}_{j;k} &= T^{i}{}_{j,k} + \Gamma^i_{\ell k} T^{\ell}{}_{j} - \Gamma^\ell_{jk} T^{i}{}_{\ell}, \\
T_{ij;k} &= T_{ij,k} - \Gamma^\ell_{ik} T_{\ell j} - \Gamma^\ell_{jk} T_{i\ell},
\end{aligned} $$

where the comma notation means the taking of partial derivatives.


The position of the indices in the expressions is very systematic. We focus on
the index $i$ or $j$ on the left-hand side. First, the index $k$ with respect to
which the derivative is taken should be the last subscript on the Christoffel
symbol. Next, if the index ($i$ or $j$) on the left-hand side is a superscript,
then the corresponding term on the right-hand side containing a Christoffel
symbol carries a plus sign. In contrast, when the index on the left-hand side
is a subscript, the corresponding term on the right carries a minus sign. We
can extend this in a straightforward manner to tensors with an arbitrary number
of contravariant and covariant indices.

Remark.

1. All of the quantities $T^{ij}{}_{;k}$, $T^{i}{}_{j;k}$, and $T_{ij;k}$ are
the components of the same third-order tensor $\nabla \mathbf{T}$ with respect
to different tensor bases, i.e.,

$$ \nabla \mathbf{T} = T^{ij}{}_{;k} \, \mathbf{e}_i \otimes \mathbf{e}_j \otimes \mathbf{e}^k = T^{i}{}_{j;k} \, \mathbf{e}_i \otimes \mathbf{e}^j \otimes \mathbf{e}^k = T_{ij;k} \, \mathbf{e}^i \otimes \mathbf{e}^j \otimes \mathbf{e}^k. $$

2. In general, we may call the $v^i{}_{;j}$ the covariant derivative of
$\mathbf{v}$ and denote it by $\nabla \mathbf{v}$. In Cartesian coordinates,
its components are just $\partial v^i / \partial x^j$.

3. Given a metric tensor $\mathbf{g}$, the covariant derivatives of its
components, $g_{ij;k}$ and $g^{ij}{}_{;k}$, are identically zero in terms of
arbitrary coordinates. This is called Ricci's theorem, for which we give the
proof in Exercises 2 and 3.


19.4.4 Vector Operators in Tensor Form 

This subsection is devoted to finding expressions for vector differential
operators such as grad, div, rot, and the Laplacian in tensor form that are
valid in general coordinate systems. In principle, they are obtained in a
straightforward manner by replacing the partial derivatives given in Cartesian
coordinates with covariant derivatives. These tensor forms, however, can be
simplified by using the metric tensor $g_{ij}$ as shown below.

1. Gradient: The gradient of a scalar $\phi$ in a general coordinate system is
given by

$$ \nabla \phi = \phi_{;i} \, \mathbf{e}^i = \frac{\partial \phi}{\partial u^i} \mathbf{e}^i, \tag{19.67} $$

since the covariant derivative of a scalar is the same as its partial
derivative.

2. Divergence: The tensor form of the divergence of a vector $\mathbf{v}$ is
given by

$$ \nabla \cdot \mathbf{v} = v^i{}_{;i} = \frac{\partial v^i}{\partial u^i} + \Gamma^i_{ki} v^k. \tag{19.68} $$

Observe that the index $i$ appears twice in the Christoffel symbol. Using
the expression (see Sect. 19.3.3)

$$ \Gamma^i_{ki} = \frac{1}{\sqrt{g}} \frac{\partial \sqrt{g}}{\partial u^k}, $$

we obtain a more compact form:

$$ \nabla \cdot \mathbf{v} = v^i{}_{;i} = \frac{\partial v^i}{\partial u^i} + \frac{1}{\sqrt{g}} \frac{\partial \sqrt{g}}{\partial u^k} v^k = \frac{1}{\sqrt{g}} \frac{\partial}{\partial u^k} \left( \sqrt{g} \, v^k \right). \tag{19.69} $$


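3. Laplacian: The tensor form of the Laplacian $\nabla^2 \phi$ is obtained by
making use of the following relation:

$$ v^i{}_{;i} = \nabla \cdot \mathbf{v} = \nabla \cdot (\nabla \phi) = \nabla^2 \phi, $$

where we assume that $\mathbf{v} = \nabla \phi$. From (19.67), we have

$$ v_i \mathbf{e}^i = \mathbf{v} = \nabla \phi = \frac{\partial \phi}{\partial u^i} \mathbf{e}^i. $$

Thus the covariant components of $\mathbf{v}$ are given by

$$ v_i = \frac{\partial \phi}{\partial u^i}, $$

and its contravariant components $v^j$ can be obtained by raising the index
using the metric tensor:

$$ v^j = g^{jk} v_k = g^{jk} \frac{\partial \phi}{\partial u^k}. $$

Substituting this into (19.69), we finally arrive at

$$ \nabla^2 \phi = \frac{1}{\sqrt{g}} \frac{\partial}{\partial u^j} \left( \sqrt{g} \, g^{jk} \frac{\partial \phi}{\partial u^k} \right). \tag{19.70} $$

4. Rotation: In general curvilinear coordinates, the operation
$\nabla \times \mathbf{v}$ is defined by

$$ [\nabla \times \mathbf{v}]_{ij} = v_{i;j} - v_{j;i}, \tag{19.71} $$

which forms covariant components of an antisymmetric tensor. The right-hand
side of (19.71) can be simplified as

$$ v_{i;j} - v_{j;i} = \frac{\partial v_i}{\partial u^j} - \Gamma^\ell_{ij} v_\ell - \frac{\partial v_j}{\partial u^i} + \Gamma^\ell_{ji} v_\ell = \frac{\partial v_i}{\partial u^j} - \frac{\partial v_j}{\partial u^i}, \tag{19.72} $$

where the Christoffel symbols cancel out owing to their symmetry in the lower
indices. Therefore, components of the tensor $\nabla \times \mathbf{v}$ can be
written in terms of partial derivatives as

$$ [\nabla \times \mathbf{v}]_{ij} = \frac{\partial v_i}{\partial u^j} - \frac{\partial v_j}{\partial u^i}. \tag{19.73} $$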


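Our results are summarized as follows:


♠ Vector operators in tensor forms:

1. $\nabla \phi = \dfrac{\partial \phi}{\partial u^i} \mathbf{e}^i$

2. $\nabla \cdot \mathbf{v} = \dfrac{1}{\sqrt{g}} \dfrac{\partial}{\partial u^k} \left( \sqrt{g} \, v^k \right)$

3. $\nabla^2 \phi = \dfrac{1}{\sqrt{g}} \dfrac{\partial}{\partial u^j} \left( \sqrt{g} \, g^{jk} \dfrac{\partial \phi}{\partial u^k} \right)$

4. $[\nabla \times \mathbf{v}]_{ij} = \dfrac{\partial v_i}{\partial u^j} - \dfrac{\partial v_j}{\partial u^i}$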
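
To make the compact forms concrete, the lines below specialize (19.69) and (19.70) to cylindrical coordinates. This is a sketch resting on the metric $g_{11} = 1$, $g_{22} = \rho^2$, $g_{33} = 1$ quoted in Sect. 19.3.4; the scalar is written $\psi$ here (our own choice of symbol) to avoid a clash with the angle $\phi$.

```latex
% Assumption: (u^1, u^2, u^3) = (\rho, \phi, z), g_{ij} = diag(1, \rho^2, 1),
% hence \sqrt{g} = \rho and g^{ij} = diag(1, 1/\rho^2, 1).
\begin{aligned}
\nabla \cdot \mathbf{v}
  &= \frac{1}{\rho} \frac{\partial}{\partial u^k} \left( \rho \, v^k \right)
   = \frac{1}{\rho} \frac{\partial (\rho v^1)}{\partial \rho}
   + \frac{\partial v^2}{\partial \phi}
   + \frac{\partial v^3}{\partial z}, \\
\nabla^2 \psi
  &= \frac{1}{\rho} \frac{\partial}{\partial \rho}
     \left( \rho \frac{\partial \psi}{\partial \rho} \right)
   + \frac{1}{\rho^2} \frac{\partial^2 \psi}{\partial \phi^2}
   + \frac{\partial^2 \psi}{\partial z^2}.
\end{aligned}
```

Note that $v^2$ is the contravariant component along $\mathbf{e}_\phi$, which is not a unit vector; the component along the unit vector in that direction is $\rho v^2$, which is why the middle term of the divergence carries no factor $1/\rho$ in this form.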


Exercises 

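1. Prove that the covariant derivatives $v^k{}_{;j}$ form a second-order tensor
of type (1, 1).

Solution: Employ the transformation laws of $v^k$ and $\Gamma^k_{pj}$ [see
(19.44)] to obtain

$$ \begin{aligned}
v^k{}_{;j} &= \frac{\partial v^k}{\partial u^j} + \Gamma^k_{pj} v^p \\
&= \frac{\partial}{\partial u^j} \left( \frac{\partial u^k}{\partial u'^q} v'^q \right) + \left( \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^r}{\partial u^p} \frac{\partial u'^s}{\partial u^j} \Gamma'^q_{rs} + \frac{\partial u^k}{\partial u'^q} \frac{\partial^2 u'^q}{\partial u^p \partial u^j} \right) \left( \frac{\partial u^p}{\partial u'^t} v'^t \right) \\
&= \frac{\partial^2 u^k}{\partial u'^a \partial u'^q} \frac{\partial u'^a}{\partial u^j} v'^q + \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^s}{\partial u^j} \frac{\partial v'^q}{\partial u'^s} \\
&\quad + \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^r}{\partial u^p} \frac{\partial u'^s}{\partial u^j} \frac{\partial u^p}{\partial u'^t} \Gamma'^q_{rs} v'^t + \frac{\partial u^k}{\partial u'^q} \frac{\partial^2 u'^q}{\partial u^p \partial u^j} \frac{\partial u^p}{\partial u'^t} v'^t.
\end{aligned} \tag{19.74} $$

The sum of the terms involving second derivatives is zero; this is seen by
taking a partial derivative with respect to $u'^q$ in the expression

$$ \frac{\partial u^k}{\partial u'^a} \frac{\partial u'^a}{\partial u^j} = \delta^k_j, $$

which yields

$$ \begin{aligned}
\frac{\partial}{\partial u'^q} \left( \frac{\partial u^k}{\partial u'^a} \frac{\partial u'^a}{\partial u^j} \right)
&= \frac{\partial^2 u^k}{\partial u'^q \partial u'^a} \frac{\partial u'^a}{\partial u^j} + \frac{\partial u^k}{\partial u'^a} \frac{\partial^2 u'^a}{\partial u'^q \partial u^j} \\
&= \frac{\partial^2 u^k}{\partial u'^q \partial u'^a} \frac{\partial u'^a}{\partial u^j} + \frac{\partial u^k}{\partial u'^a} \frac{\partial u^m}{\partial u'^q} \frac{\partial^2 u'^a}{\partial u^m \partial u^j} \\
&= 0.
\end{aligned} \tag{19.75} $$

From (19.74) and (19.75), it follows that

$$ \begin{aligned}
v^k{}_{;j} &= \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^s}{\partial u^j} \frac{\partial v'^q}{\partial u'^s} + \frac{\partial u^k}{\partial u'^q} \delta^r_t \frac{\partial u'^s}{\partial u^j} \Gamma'^q_{rs} v'^t \\
&= \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^s}{\partial u^j} \left( \frac{\partial v'^q}{\partial u'^s} + \Gamma'^q_{ts} v'^t \right) \\
&= \frac{\partial u^k}{\partial u'^q} \frac{\partial u'^s}{\partial u^j} v'^q{}_{;s},
\end{aligned} $$

in which the last factor in the last line, $v'^q{}_{;s}$, represents the
covariant derivative of $v'^q$ with respect to the primed coordinates $u'^s$.
Hence, we see that the $v^k{}_{;j}$ form a second-order tensor of type (1, 1). ♣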

2. Show that the metric tensor is a covariant constant, i.e., the covariant
derivative of any component is identically zero: $g_{kp;j} = 0$. This result is
known as Ricci's theorem.

Solution: Using (19.51) for the Christoffel symbols, it follows that

$$ \begin{aligned}
g_{kp;j} &= g_{kp,j} - \Gamma^r_{kj} g_{rp} - \Gamma^r_{pj} g_{kr} \\
&= g_{kp,j} - \frac{1}{2} g_{rp} g^{rs} (g_{js,k} + g_{sk,j} - g_{kj,s}) - \frac{1}{2} g_{kr} g^{rs} (g_{js,p} + g_{sp,j} - g_{pj,s}) \\
&= g_{kp,j} - \frac{1}{2} (g_{jp,k} + g_{pk,j} - g_{kj,p}) - \frac{1}{2} (g_{jk,p} + g_{kp,j} - g_{pj,k}) \\
&= 0.
\end{aligned} $$ ♣

3. Show that $\delta^k_j$ and $g^{kp}$ are also covariant constants.

Solution: We have
$\delta^k_{j;q} = \delta^k_{j,q} + \Gamma^k_{pq} \delta^p_j - \Gamma^p_{jq} \delta^k_p = 0$,
which completes our first proof. Next, observe that
$\delta^k_j = g_{jp} g^{pk}$ to find the identity

$$ 0 = \delta^k_{j;q} = (g_{jp} g^{pk})_{;q} = g_{jp;q} g^{pk} + g_{jp} g^{pk}{}_{;q}. $$

Since $g_{jp}$ is a covariant constant, the first term in the last expression
has the value zero. Multiplication by $g^{jr}$ produces the desired result. ♣

Remark. Owing to Ricci's theorem and its two corollaries noted above, the
components of the metric tensor can be regarded as constants under covariant
differentiation. Thus, e.g.,

$$ \begin{aligned}
g_{i\ell} A^{\ell}{}_{;k} &= (g_{i\ell} A^{\ell})_{;k} = A_{i;k}, \\
g_{i\ell} T^{\ell m}{}_{;k} &= (g_{i\ell} T^{\ell m})_{;k} = T_{i}{}^{m}{}_{;k}, \\
T_{ik;\ell} \, g^{im} g^{kn} &= (T_{ik} g^{im} g^{kn})_{;\ell} = T^{mn}{}_{;\ell},
\end{aligned} $$

and so on.


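4. Use (19.69) and (19.70) to find expressions for $\nabla^2 \phi$ and
$\nabla \cdot \mathbf{v}$ in an orthogonal coordinate system with scale factors
$h_i$ ($i = 1, 2, 3$).

Solution: For an orthogonal coordinate system $\sqrt{g} = h_1 h_2 h_3$;
further, $g^{ii} = 1/h_i^2$ for fixed $i$ and $g^{ij} = 0$ for $i \neq j$.
Therefore, from (19.70), we obtain

$$ \nabla^2 \phi = \frac{1}{h_1 h_2 h_3} \sum_{i=1}^{3} \frac{\partial}{\partial u^i} \left( \frac{h_1 h_2 h_3}{h_i^2} \frac{\partial \phi}{\partial u^i} \right). $$

In a similar manner, from (19.69) we have

$$ \nabla \cdot \mathbf{v} = \frac{1}{h_1 h_2 h_3} \frac{\partial}{\partial u^i} \left( h_1 h_2 h_3 \, v^i \right). $$ ♣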


19.5 Applications in Physics and Engineering 

19.5.1 General Relativity Theory 

It cannot be denied that the general theory of relativity is one of the most
famous and beautiful applications of non-Cartesian tensor calculus in physics.
This section outlines the concepts one needs in order to understand the general
theory of relativity; these are necessary for obtaining the gravitational field
equation and the relevant tensorial quantities involved in the equation.

Before proceeding to the argument, let us point out that the notion of
geometric curvature, which quantifies the curvature of space at any given
point in the space considered, is central to general relativity. In
Sect. 19.2.3, we learned that a space is flat locally (or in its entirety) if
there exist coordinates $x^i$ such that the line element through a limited
region (or the whole) can be written as

$$ (ds)^2 = \epsilon_i (dx^i)^2, $$

where $\epsilon_i = \pm 1$. However, if we employ a different coordinate system
$x'^i$, the line element $(ds)^2$, in general, is not of the above form, but
reads as

$$ (ds)^2 = g_{ij} \, dx'^i dx'^j $$

with the appropriate metric tensor $g_{ij}$. Hence, we require a means of
identifying a flat space directly from the metric $g_{ij}$, independent of our
choice of coordinate system. Such a coordinate-independent way of defining the
curvature of a space leads to the field equation of gravity, i.e., Einstein's
field equation, described in Sect. 19.5.4.


19.5.2 Riemann Tensor 


The curvature of space can be quantified in a manner independent of the
coordinate system by changing the order of covariant differentiation.
Covariant differentiation is a generalization of partial differentiation for
which, unlike the partial case, interchanging the order of differentiation in
general changes the result. To illustrate this, let us consider an arbitrary
vector field with covariant components $v_i$. The covariant derivative of $v_i$
is given by [see (19.64)]

$$ v_{i;j} = \frac{\partial v_i}{\partial u^j} - \Gamma^\ell_{ij} v_\ell. $$

A second covariant differentiation then yields

$$ \begin{aligned}
(v_{i;j})_{;k} &= \frac{\partial v_{i;j}}{\partial u^k} - \Gamma^m_{ik} v_{m;j} - \Gamma^m_{jk} v_{i;m} \\
&= \frac{\partial^2 v_i}{\partial u^j \partial u^k} - \frac{\partial \Gamma^\ell_{ij}}{\partial u^k} v_\ell - \Gamma^\ell_{ij} \frac{\partial v_\ell}{\partial u^k} - \Gamma^m_{ik} \left( \frac{\partial v_m}{\partial u^j} - \Gamma^\ell_{mj} v_\ell \right) - \Gamma^m_{jk} \left( \frac{\partial v_i}{\partial u^m} - \Gamma^\ell_{im} v_\ell \right).
\end{aligned} $$

Interchanging the indices $j$ and $k$ to obtain the expression corresponding to
$(v_{i;k})_{;j}$ and then subtracting it from the above relation gives us

$$ (v_{i;j})_{;k} - (v_{i;k})_{;j} = R^\ell{}_{ijk} v_\ell, $$

where

$$ R^\ell{}_{ijk} = \frac{\partial \Gamma^\ell_{ik}}{\partial u^j} - \frac{\partial \Gamma^\ell_{ij}}{\partial u^k} + \Gamma^m_{ik} \Gamma^\ell_{mj} - \Gamma^m_{ij} \Gamma^\ell_{mk}. \tag{19.76} $$

The quantity shown on the left-hand side is called the Riemann tensor
(or curvature tensor). Since Christoffel symbols $\Gamma^k_{ij}$ are functions
of the metric tensor $g_{ij}$, (19.76) indicates that the Riemann tensor is
defined in terms of the metric tensor and its first and second derivatives.

Recall that if the space being considered is flat, we may choose coordinates
such that $\Gamma^k_{ij}$ and its derivatives vanish. Therefore, we have

$$ R^\ell{}_{ijk} = 0 \tag{19.77} $$

at every point in the flat region. In fact, it is possible to show that (19.77) 
is a necessary and sufficient condition for the region of a space to be flat. 
Consequently, we conclude the following: when the Riemann tensor satisfies 
(19.77), it indicates that the region of a space is flat and when it does not 
satisfy (19.77), the region is curved. 
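
The flatness criterion (19.77) can be illustrated numerically. The sketch below evaluates (19.76) by central differences for two two-dimensional examples that are our own assumptions, not taken from the text: the plane in polar coordinates $(r, \phi)$, with $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{r\phi} = 1/r$, and the unit sphere in coordinates $(\theta, \phi)$, with the standard symbols $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$. Indices are 0-based and all names are hypothetical.

```python
import math


def gamma_polar(l, i, j, u):
    """Gamma^l_ij for the plane in polar coordinates u = [r, phi]."""
    r = u[0]
    if l == 0 and i == 1 and j == 1:
        return -r
    if l == 1 and {i, j} == {0, 1}:
        return 1.0 / r
    return 0.0


def gamma_sphere(l, i, j, u):
    """Gamma^l_ij for the unit sphere, u = [theta, phi] (standard values, assumed)."""
    theta = u[0]
    if l == 0 and i == 1 and j == 1:
        return -math.sin(theta) * math.cos(theta)
    if l == 1 and {i, j} == {0, 1}:
        return math.cos(theta) / math.sin(theta)
    return 0.0


def d_gamma(gamma, l, i, j, wrt, u, h=1e-5):
    """Central-difference estimate of d Gamma^l_ij / d u^wrt."""
    plus, minus = list(u), list(u)
    plus[wrt] += h
    minus[wrt] -= h
    return (gamma(l, i, j, plus) - gamma(l, i, j, minus)) / (2.0 * h)


def riemann(gamma, l, i, j, k, u):
    """R^l_ijk following the index convention of (19.76)."""
    value = d_gamma(gamma, l, i, k, j, u) - d_gamma(gamma, l, i, j, k, u)
    for m in range(len(u)):
        value += gamma(m, i, k, u) * gamma(l, m, j, u)
        value -= gamma(m, i, j, u) * gamma(l, m, k, u)
    return value


flat = [riemann(gamma_polar, l, i, j, k, [2.0, 0.5])
        for l in range(2) for i in range(2) for j in range(2) for k in range(2)]
curved = riemann(gamma_sphere, 0, 1, 0, 1, [1.0, 0.3])  # R^theta_{phi theta phi}
```

All entries of `flat` should vanish up to numerical error even though the polar Christoffel symbols themselves do not, while `curved` should be close to $\sin^2\theta$ at $\theta = 1$, signalling that the sphere is not flat.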

Two relevant quantities are obtained by contracting the Riemann tensor. 
One is the Ricci tensor defined by 


and the other is the scalar curvature (or Ricci scalar) given by

$$ R = g^{ij} R_{ij} = R^i{}_i. $$

These two quantities are important for introducing the Einstein tensor

$$ G^{ij} = R^{ij} - \frac{1}{2} g^{ij} R, $$

which describes the space-time curvature in the field equation of general
relativity.


19.5.3 Energy Momentum Tensor 

We now wish to determine the form of the gravitational field equation that,
in the weak limit of a static gravitational field, reduces to the classical
Newtonian field of gravity described by

$$ \nabla^2 \Phi = 4 \pi G \rho. \tag{19.78} $$

Here, $\Phi$ is the potential field that corresponds to the space-time
curvature in relativistic theory, $G$ is the universal gravitational constant,
and $\rho$ is the mass-density distribution of matter. Note that (19.78) is a
form of Poisson's equation with $4 \pi G \rho$ as the source term. This implies
the presence of a corresponding source term associated with the space-time
curvature in Einstein's field equation. This source term is given by the
energy-momentum tensor $T^{ij}$ defined by

$$ T^{ij} = \rho \, u^i u^j. $$

Here, $\rho$ is the density of matter, $u^i$ is the four-velocity represented
by $u^i = (u^0, u^1, u^2, u^3) = \gamma (c, \mathbf{v})$, where $c$ is the
velocity of light, $\mathbf{v}$ is the three-dimensional velocity
(nonrelativistic) of a particle, and $\gamma = (1 - v^2/c^2)^{-1/2}$.
The physical interpretations of the components of the energy-momentum tensor
are:

$T^{00}$: the energy density of the particles.

$T^{0i}$: the energy flux (the heat conduction) in the $i$th direction.

$T^{i0}$: the momentum density in the $i$th direction.

$T^{ij}$: the flow of the $i$th-component momentum in the $j$th direction
(i.e., the random thermal motions giving rise to viscous stress).

19.5.4 Einstein Field Equation 

The ingredients necessary to obtain Einstein's field equation, which relates
the geometric space-time curvature to the density of mass-energy, are already
on hand. One side of the equation should comprise the measure of the density
of mass-energy, i.e., the stress-energy tensor $T_{ij}$, and the other side
should consist of a measure of the curvature involving the Ricci curvature
$R_{ij}$ and scalar curvature $R$. By making this equation consistent with
Newton's equation of motion in the limit of a weak gravitational force as well
as with several postulates from a physical standpoint, Einstein's field
equation is obtained in the following form:

$$ R_{ij} - \frac{1}{2} g_{ij} R = \frac{8 \pi G}{c^4} T_{ij}. \tag{19.79} $$

Given the matter source $T_{ij}$, this tensor equation is composed of ten
partial differential equations for the metric tensor $g_{ij}(x)$. The tensor
equation is analogous to the Maxwell equations that determine the
electromagnetic field given the charge and current densities (see
Sect. 18.5.3). Unlike the Maxwell equations, however, the differential
equations of gravitational theory are nonlinear, which makes them very
difficult to solve. Surprisingly, despite the nonlinearity, a number of exact
solutions have been obtained owing to the presence of symmetries in space-time,
which restrict the possible forms of the metric.

Remark. Einstein’s field equation in (19.79) is the most fundamental equation 
in classical physics. The explicit form of the equation can be derived from a 
few arguments. However, it cannot be derived from other physical principles 
since there is no theory that is more fundamental. 



20 


Tensor as Mapping 


Abstract In this chapter, we show that tensors can be identified with mathemat- 
ical operators that transform elements from one abstract vector space to another. 
This viewpoint on tensors is apparently different from those presented in Chaps. 18 
and 19, where tensors have been identified as sets of index quantities subject to a 
transformation law under changes of coordinate systems. However, the viewpoint 
presented here turns out to be consistent with those presented in the previous two 
chapters when we introduce the concept of inner product into the abstract vector 
space (Sect. 20.3.4). 


20.1 Vector as a Linear Function 

20.1.1 Overview 

In Chaps. 18 and 19 tensors are defined as collections of index quantities
that obey characteristic transformation laws under a change of coordinate
systems. In this chapter we present an alternative definition of tensors that
does not require specification of a coordinate system, so that it is suitable
for more general tensor analyses describing geometric properties of abstract
vector spaces other than our familiar three-dimensional Euclidean space.

The crucial point is that in this alternative definition, a tensor is considered
not as a set of index quantities but as an operator (linear function or mapping)
acting on vector spaces. For instance, a second-order tensor $T$ is identified
with a linear function that associates two vectors $v$ and $w$ with a real
number $c \in \mathbb{R}$, which is symbolized by

$$ T(v, w) = c. $$

Emphasis should be placed on the fact that such a generalized definition
of tensors applies to all kinds of general vector spaces (finite-dimensional),
regardless of whether or not they possess geometric properties such as the
distance, norm, or inner product of their elements (see Sect. 4.2.1). In
fact, the tensors we discussed earlier belong to a specific class of more
general tensors, in the sense that they were defined solely on the
three-dimensional Euclidean space, a particular class of vector spaces endowed
with the inner product property. However, we shall see that the concept of
tensor can be extended beyond inner product spaces by introducing the more
general definition referred to above.

Throughout the following discussion, we restrict our arguments to
finite-dimensional vector spaces over $\mathbb{R}$ in order to provide a minimal
course for general tensor calculus.


20.1.2 Vector Spaces Revisited 

To begin with, we briefly review the definition of abstract vector spaces. A 
vector space (or linear space) V over $\mathbb{R}$ is a set of elements called vectors 
that have two operations, addition and scalar multiplication, and a 
distinguishing element $0 \in V$. Here, addition (denoted by +) assigns to each pair 
of elements $v, w \in V$ a third element $v + w \in V$, and scalar multiplication 
assigns an element $cv \in V$ to each $v \in V$ and $c \in \mathbb{R}$. By definition, all 
elements $v, w, x \in V$ and all $a, b \in \mathbb{R}$ must satisfy the following axioms. 


1. The commutative law for +, i.e., v + w = w + v. 

2. The associative law for +, i.e., (v + w) + x = v + (w + x). 

3. Existence of identity for +, i.e., v + 0 = v. 

4. Existence of negatives, i.e., there is $-v$ such that $v + (-v) = 0$. 

5. $a(v + w) = av + aw$. 

6. $(a + b)v = av + bv$. 

7. $(ab)v = a(bv)$. 

8. $1v = v$. 

Given two vector spaces V and W, it is possible to set a function f so that 

$$f : V \to W.$$

The function f is called a linear function (or linear mapping) of V into 
W if for all $v_1, v_2 \in V$ and $c \in \mathbb{R}$ it yields 

$$f(v_1 + v_2) = f(v_1) + f(v_2), \qquad f(c v_1) = c f(v_1).$$

20.1.3 Vector Spaces of Linear Functions 

In elementary calculus, the concepts of vectors and linear functions are dis- 
tinguished from one another: vectors are elements of a vector space and linear 
functions provide a correspondence between them. However, in view of the 
axioms 1 to 8 above, we observe that the set of linear functions $f, g, \cdots$ of V 
into W also forms a vector space in which addition and scalar multiplication, 
respectively, are defined by 

$$(f + g)(v) = f(v) + g(v), \tag{20.1}$$

and 

$$(cf)(v) = c f(v), \tag{20.2}$$

where $v \in V$ and $f(v), g(v) \in W$. We denote by $\mathcal{L}(V, W)$ the vector space 
formed by the linear functions 

$$f : V \to W.$$

It is a trivial matter to verify that $f + g$ and $cf$ are also linear functions and so 
belong to the same vector space $\mathcal{L}(V, W)$. These arguments are summarized 
by the following important theorem: 

Vector space of linear functions: 

Let V and W be vector spaces. The set of linear functions $f : V \to W$ 
forms a vector space denoted by $\mathcal{L}(V, W)$. 


This theorem states that the linear functions of V into W are 
elements of a vector space $\mathcal{L}(V, W)$, analogous to vectors $v_1, v_2, \cdots$ being 
elements of a vector space V. This analogy implies that a linear function 
$f \in \mathcal{L}(V, W)$ can be regarded as a vector and, conversely, that a vector $v \in V$ 
can be regarded as a linear function. This identification of vectors and linear 
functions is crucially important for obtaining a generalized definition of tensors 
that is free of the concept of inner product and the specification of a coordinate 
system. 
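
As a concrete illustration of (20.1) and (20.2), the following Python sketch represents linear maps from $\mathbb{R}^2$ to $\mathbb{R}^2$ as functions and forms the combination $2f + g$ pointwise. The helper names (`matrix_map`, `add_maps`, `scale_map`, `is_linear_on`) and the sample matrices are illustrative assumptions, not part of the text, and the check tests linearity on sample vectors only; it is not a proof.

```python
from typing import Callable, List, Sequence

Vector = List[float]
LinearMap = Callable[[Sequence[float]], Vector]


def matrix_map(m: Sequence[Sequence[float]]) -> LinearMap:
    """Return the linear map v -> m v defined by the matrix m."""
    def f(v: Sequence[float]) -> Vector:
        return [sum(row[j] * v[j] for j in range(len(v))) for row in m]
    return f


def add_maps(f: LinearMap, g: LinearMap) -> LinearMap:
    """Pointwise sum of two maps, as in (20.1)."""
    return lambda v: [a + b for a, b in zip(f(v), g(v))]


def scale_map(c: float, f: LinearMap) -> LinearMap:
    """Scalar multiple of a map, as in (20.2)."""
    return lambda v: [c * a for a in f(v)]


def is_linear_on(h: LinearMap, v1: Vector, v2: Vector, c: float) -> bool:
    """Check additivity and homogeneity of h on one sample (not a proof)."""
    v_sum = [a + b for a, b in zip(v1, v2)]
    additive = h(v_sum) == [a + b for a, b in zip(h(v1), h(v2))]
    homogeneous = h([c * a for a in v1]) == [c * a for a in h(v1)]
    return additive and homogeneous


f = matrix_map([[1, 2], [0, 1]])
g = matrix_map([[3, 0], [1, 1]])
h = add_maps(scale_map(2, f), g)  # the element 2f + g of the function space

print(h([1, 1]))
print(is_linear_on(h, [1, 2], [3, -1], 5))
```

Integer sample data are used so that the equality checks are exact; with floating-point entries a tolerance would be needed.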


20.1.4 Dual Spaces 

Let V* denote the set of all linear functions 

$$f : V \to \mathbb{R}.$$

(Note that the asterisk (*) in V* does not mean complex conjugate.) Then, 
since 

$$V^* = \mathcal{L}(V, \mathbb{R}),$$

it follows that V* is a vector space. The vector space V* is called the dual 
space (or conjugate space) of V, whose elements $f \in V^*$ associate a vector 
$v \in V$ with a real number $c \in \mathbb{R}$, symbolized as 

$$f(v) = c.$$

Particularly important elements of V* are the linear functions 

$$\varepsilon^i : V \to \mathbb{R} \quad (i = 1, 2, \cdots, n)$$

that associate the basis vector $e_i \in V$ with the unit number 1. In fact, the set 
of such linear functions $\{\varepsilon^i\}$ serves as a basis of the dual space V*, as stated 
below. 


Dual basis: 

For each basis $\{e_i\}$ for V, there is a unique basis $\{\varepsilon^j\}$ for V* such that 

$$\varepsilon^j(e_i) = \delta^j_i. \tag{20.3}$$

The linear functions $\varepsilon^j : V \to \mathbb{R}$ defined by (20.3) make up the dual basis 
to the basis $\{e_i\}$ of V. 

Proof Let us verify that the set $\{\varepsilon^j\}$ defined by (20.3) serves as a basis of 
V*. Recall that in finite dimensions, a basis of a vector space V is defined 
as a set of linearly independent vectors that spans all of V. To show linear 
independence, we assume that $a_j \varepsilon^j = 0$. Then we have 

$$a_j \varepsilon^j(e_i) = a_j \delta^j_i = a_i = 0 \quad \text{for all } i,$$

which implies that $\{\varepsilon^j\}$ is linearly independent. To show that the set spans V*, 
take any $\tau \in V^*$ and $v = v^i e_i \in V$; linearity and (20.3) give 
$\tau(v) = v^i \tau(e_i) = \tau(e_i)\,\varepsilon^i(v)$, so that $\tau = \tau(e_i)\,\varepsilon^i$. ∎ 

Remark. Raising of the index j attached to $\varepsilon^j$ is intentional, as this convention 
is necessary to provide a consistent notation of components of generalized 
tensors, demonstrated in Sect. 20.3. 


Examples Expand a vector $v \in V$ as 

$$v = v^i e_i,$$

to find that 

$$\varepsilon^j(v) = \varepsilon^j(v^i e_i) = v^i \varepsilon^j(e_i) = v^i \delta^j_i = v^j.$$

This indicates that $\varepsilon^j$ is the linear function that reads off the jth component of 
v with respect to the basis $\{e_i\}$. 
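
The defining relation (20.3) can be made concrete in $\mathbb{R}^2$. The sketch below assumes that vectors are written as coordinate pairs and covectors as coefficient pairs acting by the usual sum of products; under that assumption the dual basis covectors are the rows of the inverse of the matrix whose columns are $e_1, e_2$. The function names `dual_basis_2d` and `apply` are hypothetical, and exact rational arithmetic is used so that the Kronecker delta is reproduced without rounding.

```python
from fractions import Fraction


def dual_basis_2d(e1, e2):
    """Return (eps1, eps2): rows of the inverse of the matrix with columns e1, e2."""
    a, c = e1  # first column of the basis matrix
    b, d = e2  # second column of the basis matrix
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent")
    eps1 = (d / det, -b / det)
    eps2 = (-c / det, a / det)
    return eps1, eps2


def apply(eps, v):
    """Evaluate the covector eps on the vector v."""
    return sum(t * x for t, x in zip(eps, v))


e1, e2 = (1, 0), (1, 1)          # a non-orthogonal basis, chosen for illustration
eps1, eps2 = dual_basis_2d(e1, e2)

v = (3, 2)
components = (apply(eps1, v), apply(eps2, v))  # (v^1, v^2) with v = v^i e_i
print(components)
```

Here the covectors pick out the components of `v` with respect to the chosen basis, in line with the example above.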

20.1.5 Equivalence Between Vectors and Linear Functions 

If V is a vector space and $\tau \in V^*$, then $\tau$ is a function of the variable $v \in V$ 
that generates a real number denoted by $\tau(v)$. Owing to the identification of 
vectors and linear functions, however, it is possible to reverse our reasoning 
and consider v as a function of the variable $\tau$, again with the real value 
$v(\tau) = \tau(v)$. When we take this approach, v is a linear function on V*. 

Remark. The two views contrasted above are both asymmetric, but this 
asymmetry can be eliminated by introducing the notation 

$$(\ ,\ ) : V \times V^* \to \mathbb{R},$$

which gives 

$$(v, \tau) = \tau(v) = v(\tau) \in \mathbb{R}.$$

Here $(\ ,\ )$ is a function of two variables v and $\tau$, called the natural pairing 
of V and V* into $\mathbb{R}$. It is easy to verify that $(\ ,\ )$ is bilinear. 


The concepts and notation introduced in Sect. 20.1.3 and 20.1.4 serve as 
preliminaries for the discussions in the following sections. 


20.2 Tensor as Multilinear Function 

20.2.1 Direct Product of Vector Spaces 

To arrive at the new definition of tensors we are seeking requires three more 
concepts, demonstrated in Sect. 20.2.1-20.2.2. 

The first is the direct product of vector spaces; if V and W are vector 
spaces, then we can establish a new vector space by forming the direct 
product (or Cartesian product) $V \times W$ of the two spaces. The direct product 
$V \times W$ consists of ordered pairs $(v, w)$ with $v \in V$ and $w \in W$, as symbolized 
by 

$$V \times W = \{(v, w) \mid v \in V,\ w \in W\}.$$

The addition and scalar multiplication of the elements are defined by 

$$(v, w_1) + (v, w_2) = (v, w_1 + w_2),$$

$$(v_1, w) + (v_2, w) = (v_1 + v_2, w),$$

$$c(v, w) = (cv, w) = (v, cw).$$

The linear dimension of the resulting vector space $V \times W$ equals the product 
of the linear dimensions of V and W. An element $(v, w)$ of the direct product 
$V \times W$ is sometimes denoted by v w. 

Remark. The reader should note a distinction between the direct product 
$V \times W$ and the direct sum $V + W$ of the two vector spaces. A direct sum 
$V + W$ consists of all pairs $(v, w) = (w, v)$ with $v \in V$ and $w \in W$ for which 
addition and scalar multiplication are defined by 

$$(v_1, w_1) + (v_2, w_2) = (v_1 + v_2, w_1 + w_2), \qquad c(v, w) = (cv, cw).$$

The linear dimension is thus equal to the sum of the dimensions of V and W. 
Every linear vector space of dimension greater than one can be represented 
by a direct sum of nonintersecting subspaces. 

20.2.2 Multilinear Functions 

Let $V_1$, $V_2$ and W be vector spaces. A function 

$$f : V_1 \times V_2 \to W$$

is called bilinear if it is linear in each variable, i.e., if 

$$f(a v_1 + b v_1', v_2) = a f(v_1, v_2) + b f(v_1', v_2),$$

$$f(v_1, a v_2 + b v_2') = a f(v_1, v_2) + b f(v_1, v_2').$$

The extension of this definition to functions of more than two variables is 
simple. Indeed, functions such as 

$$f : V_1 \times V_2 \times \cdots \times V_n \to W \tag{20.4}$$

are called multilinear functions, more specifically n-linear functions, for 
which the defining relation is 

$$f(v_1, \cdots, a v_i + b v_i', \cdots, v_n) = a f(v_1, \cdots, v_i, \cdots, v_n) + b f(v_1, \cdots, v_i', \cdots, v_n).$$

An n-linear function can be multiplied by a scalar and two n-linear 
functions can be added; in each case the result is an n-linear function. Thus, 
the set of n-linear functions given in (20.4) forms a vector space denoted by 
$\mathcal{L}(V_1 \times \cdots \times V_n, W)$. 


20.2.3 Tensor Product 

Suppose that $\tau^1 \in V_1^*$ and $\tau^2 \in V_2^*$, i.e., $\tau^1$ and $\tau^2$ are linear real-valued 
functions on $V_1$ and $V_2$, respectively. We can then form a bilinear real-valued 
function 

$$\tau^1 \otimes \tau^2 : V_1 \times V_2 \to \mathbb{R},$$

which is represented by 

$$\tau^1 \otimes \tau^2 (v_1, v_2) = \tau^1(v_1)\,\tau^2(v_2). \tag{20.5}$$

Note that the right-hand side of (20.5) is just the product of two real numbers: 
$\tau^1(v_1)$ and $\tau^2(v_2)$. The bilinear function $\tau^1 \otimes \tau^2$ is called the tensor product 
of $\tau^1$ and $\tau^2$. Clearly, since $\tau^1$ and $\tau^2$ are separately linear, $\tau^1 \otimes \tau^2$ is linear 
in each of its two arguments. Hence, the tensor products $\tau^1 \otimes \tau^2$ are elements 
of the vector space $\mathcal{L}(V_1 \times V_2, \mathbb{R})$. 

Recall that the vectors $v \in V$ can be regarded as linear functions acting 
on V*. In this context, we can also construct tensor products of two vectors. 
For example, let $v_1 \in V_1$ and $v_2 \in V_2$ and define the tensor product 

$$v_1 \otimes v_2 : V_1^* \times V_2^* \to \mathbb{R}$$

by 

$$v_1 \otimes v_2 (\tau^1, \tau^2) = v_1(\tau^1)\,v_2(\tau^2) = \tau^1(v_1)\,\tau^2(v_2). \tag{20.6}$$

This shows that the tensor product $v_1 \otimes v_2$ can be considered a bilinear 
function acting on $V_1^* \times V_2^*$, similar to $\tau^1 \otimes \tau^2$ being a bilinear function on 
$V_1 \times V_2$, which indicates that the tensor products $v_1 \otimes v_2$ are elements of the 
vector space $\mathcal{L}(V_1^* \times V_2^*, \mathbb{R})$. 

Furthermore, given a vector space V, we can construct mixed types of 
tensor products such as 

$$v \otimes \tau : V^* \times V \to \mathbb{R}$$

given by 

$$v \otimes \tau (\phi, u) = v(\phi)\,\tau(u) = \phi(v)\,u(\tau), \tag{20.7}$$

where $u, v \in V$ and $\phi, \tau \in V^*$. By straightforward extrapolation, it is possible 
to develop tensor products of more than two linear functions or vectors such 
as 

$$v_1 \otimes v_2 \otimes \cdots \otimes v_r \otimes \tau^1 \otimes \tau^2 \otimes \cdots \otimes \tau^s, \tag{20.8}$$

which act on the vector space 

$$V^* \times V^* \times \cdots \times V^* \times V \times V \times \cdots \times V,$$

where V* appears r times and V s times. Similar to the previous cases, the 
set of tensor products (20.8) forms a vector space denoted by 

$$\mathcal{L}[(V^*)^r \times V^s, \mathbb{R}],$$

where $(V^*)^r$ and $V^s$ are direct products of V* with r factors and those of V 
with s factors, respectively. 
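
The product rule in (20.5) to (20.8) can be sketched in Python as follows. The sketch assumes that vectors and covectors are both given by component lists and that either one acts on the other through the sum of products of components; `as_function` and `tensor_product` are hypothetical helper names introduced only for this illustration.

```python
def as_function(components):
    """Turn a component list into the linear function x -> sum_i c_i x_i."""
    return lambda x: sum(c * xi for c, xi in zip(components, x))


def tensor_product(*factors):
    """Return the multilinear function (x1, ..., xk) -> f1(x1) * ... * fk(xk)."""
    def product_function(*args):
        if len(args) != len(factors):
            raise ValueError("one argument is required per factor")
        result = 1
        for f, x in zip(factors, args):
            result *= f(x)
        return result
    return product_function


tau1 = as_function([1, 2])   # sample covector tau^1
tau2 = as_function([0, 3])   # sample covector tau^2

T = tensor_product(tau1, tau2)          # tau^1 (x) tau^2, as in (20.5)
T_swapped = tensor_product(tau2, tau1)  # tau^2 (x) tau^1

print(T([1, 1], [2, 1]))
print(T_swapped([1, 1], [2, 1]))
```

Evaluating `T` and `T_swapped` on the same pair of arguments gives different numbers for this sample data, which illustrates that the order of the factors matters.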


20.2.4 General Definition of Tensors 

We finally arrive at the following generalized definition of a tensor. 


Tensor: 

Let V be a vector space with a dual space V*. Then a tensor of type 
(r, s), denoted by $T^r_s$, is a multilinear function 

$$T^r_s : (V^*)^r \times V^s \to \mathbb{R}.$$

The number r is called the contravariant degree of the tensor, and s is 
called the covariant degree of the tensor. 

Tensor space: 

The set of all tensors $T^r_s$ for fixed r and s forms a vector space, called 
a tensor space, denoted by 

$$\mathcal{T}^r_s(V) = \mathcal{L}[(V^*)^r \times V^s, \mathbb{R}].$$


As an example, let $v_1, \cdots, v_r \in V$ and $\tau^1, \cdots, \tau^s \in V^*$ and define the tensor 
product (i.e., multilinear function) 

$$T^r_s = v_1 \otimes \cdots \otimes v_r \otimes \tau^1 \otimes \cdots \otimes \tau^s, \tag{20.9}$$

which yields for $\theta^1, \cdots, \theta^r \in V^*$ and $u_1, \cdots, u_s \in V$, 

$$v_1 \otimes \cdots \otimes v_r \otimes \tau^1 \otimes \cdots \otimes \tau^s (\theta^1, \cdots, \theta^r, u_1, \cdots, u_s) = v_1(\theta^1) \cdots v_r(\theta^r)\,\tau^1(u_1) \cdots \tau^s(u_s) = \prod_{i=1}^{r} v_i(\theta^i) \prod_{j=1}^{s} \tau^j(u_j). \tag{20.10}$$

Observe that each v in the tensor product (20.10) requires an element $\theta \in V^*$ 
to produce a real number, which is why the number of factors of V* in the 
direct product (20.9) equals the number of v's in the tensor product (20.10). 

In particular, a tensor of type (0, 0) is defined as a scalar, so $\mathcal{T}^0_0(V) = \mathbb{R}$; 
a tensor of type (1, 0), an ordinary vector, is called a contravariant vector; 
and one of type (0, 1), a linear function, is called a covariant vector. More 
generally, a tensor of type (r, 0) is called a contravariant tensor of rank 
(or degree) r and one of type (0, s) is called a covariant tensor of rank (or 
degree) s. 

Remark. We can form a tensor product of two tensors $T^r_s$ and $U^k_l$ such as 

$$T^r_s \otimes U^k_l : (V^*)^{r+k} \times V^{s+l} \to \mathbb{R},$$

which is a natural generalization of tensor products given in (20.5), (20.6), 
and (20.7). It is easy to prove that the tensor product is associative and 
distributive over tensor addition, but not commutative. 


20.3 Components of Tensors 

20.3.1 Basis of a Tensor Space 

In physical applications of tensor calculus, it is necessary to choose a basis for 
the vector space V and one for its dual space V* to represent the tensors by a 
set of real numbers (i.e., components). The need for this process is analogous 
to the cases of elementary vector calculus, in which linear operators are often 
represented by arrays of numbers, i.e., by matrices referring to a chosen basis 
of the space. A basis of our tensor space $\mathcal{T}^r_s(V) = \mathcal{L}[(V^*)^r \times V^s, \mathbb{R}]$ is defined 
as follows. 


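Basis for the tensor space: 

Let $\{e_i\}$ and $\{\varepsilon^j\}$ be bases of V and V*, respectively. Then, a basis 
for the tensor space $\mathcal{T}^r_s(V)$ is the set of all tensor products 

$$e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes \varepsilon^{j_1} \otimes \cdots \otimes \varepsilon^{j_s}. \tag{20.11}$$


Components of a tensor: 

The components of any tensor $A \in \mathcal{T}^r_s(V)$ are the real numbers given 
by 

$$A^{i_1 \cdots i_r}_{j_1 \cdots j_s} = A(\varepsilon^{i_1}, \cdots, \varepsilon^{i_r}, e_{j_1}, \cdots, e_{j_s}).$$


Remark. 1. A useful result of the theorem is the relation 

$$A = A^{i_1 \cdots i_r}_{j_1 \cdots j_s}\; e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes \varepsilon^{j_1} \otimes \cdots \otimes \varepsilon^{j_s}.$$

2. Note that for every factor in the basis of $\mathcal{T}^r_s(V)$ there are N possibilities. 
(For instance, we have N choices for $e_{i_1}$, in which $i_1 = 1, 2, \cdots, N$.) Thus, 
the number of possible tensor products represented by (20.11) is $N^{r+s}$. 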
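
The count $N^{r+s}$ can be checked by listing the index labels of the basis elements (20.11) explicitly. In the sketch below, `basis_labels` is a hypothetical helper that returns one pair of index tuples $(i_1, \cdots, i_r)$, $(j_1, \cdots, j_s)$ per basis tensor; it enumerates labels only and does not construct the tensors themselves.

```python
from itertools import product


def basis_labels(n, r, s):
    """List the index labels ((i_1..i_r), (j_1..j_s)) of the basis tensors (20.11)."""
    indices = range(1, n + 1)
    return [(upper, lower)
            for upper in product(indices, repeat=r)
            for lower in product(indices, repeat=s)]


labels = basis_labels(3, 1, 2)   # type (1, 2) tensors over a 3-dimensional space
print(len(labels))               # equals 3 ** (1 + 2)
print(labels[0], labels[-1])
```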

Examples 1. A tensor space $\mathcal{T}^1_0(V)$ has a basis $\{e_i\}$ so that an element (i.e., 
a contravariant vector) $v \in \mathcal{T}^1_0(V)$ can be expanded by 

$$v = v^i e_i,$$

where the real numbers $v^i = v(\varepsilon^i)$ are called the components of $v : V^* \to \mathbb{R}$. 

2. A tensor space $\mathcal{T}^0_1(V)$ has a basis $\{\varepsilon^j\}$ so that an element (i.e., a covariant 
vector) $\tau \in \mathcal{T}^0_1(V)$ can be expanded by 

$$\tau = \tau_j \varepsilon^j,$$

where the real numbers $\tau_j = \tau(e_j)$ are called the components of $\tau : V \to \mathbb{R}$. 

3. A tensor space $\mathcal{T}^2_1(V)$ has a basis $\{e_i \otimes e_j \otimes \varepsilon^k\}$ so that for any $A \in \mathcal{T}^2_1(V)$ 
we have 

$$A = A^{ij}_k\, e_i \otimes e_j \otimes \varepsilon^k,$$

where the real numbers 

$$A^{ij}_k = A(\varepsilon^i, \varepsilon^j, e_k)$$

are the components of $A : V^* \times V^* \times V \to \mathbb{R}$. 

20.3.2 Transformation Laws of Tensors 

The components of a tensor depend on the basis in which they are de- 
scribed. If the basis is changed, the components change. The relation between 
components of a tensor in different bases is called the transformation law 
for that particular tensor. Let us investigate this concept. 

Assume two different bases of V, denoted by $\{e_i\}$ and $\{e'_i\}$. Similarly, we 
denote by $\{\varepsilon^j\}$ and $\{\varepsilon'^j\}$ two different bases of V*. We can find appropriate 
transformation matrices $[R^j_i]$ and $[S^k_l]$ that satisfy 

$$e'_i = R^j_i e_j \quad \text{and} \quad \varepsilon'^k = S^k_l \varepsilon^l.$$

Then, for a tensor T of type (1,2), we have 

$$T'^i_{jk} = T(\varepsilon'^i, e'_j, e'_k) = T(S^i_l \varepsilon^l, R^m_j e_m, R^n_k e_n) = S^i_l R^m_j R^n_k\, T(\varepsilon^l, e_m, e_n) = S^i_l R^m_j R^n_k\, T^l_{mn}, \tag{20.12}$$

which is the transformation law of the components of the tensor T of type 
(1,2). 

Remember that in the coordinate-dependent treatment of tensors (as 
shown in Chaps. 18 and 19), the result (20.12) was considered to be the 
defining relation for a tensor of type (1,2). In other words, a tensor of type 
(1,2) was defined as a collection of numbers $T^l_{mn}$ that transform to another 
collection of numbers $T'^i_{jk}$ according to the rule in (20.12) when the basis is 
changed. In our current (i.e., coordinate-free) treatment of tensors, it is not 
necessary to introduce a basis to define a tensor; a basis must be introduced 
only when the components of a tensor are needed. The advantage of this 
approach is obvious, since a (1,2)-type tensor has 27 components in three 
dimensions and 64 components in four dimensions, and all of these can be 
represented by the single symbol T. 
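
A small numerical check of (20.12) is sketched below for a two-dimensional space. The matrices `R` and `S`, the sample components `T`, and all function names are assumptions made for illustration; in particular, `S` is taken to be the matrix inverse of `R`, which is what is needed for the primed bases of V and V* to remain dual to each other. The check confirms that the number $T(\tau, u, w)$ does not depend on the basis used to compute it.

```python
from itertools import product

N = 2  # dimension of V (assumed for illustration)

# R[j][i] = R^j_i in e'_i = R^j_i e_j; S[k][l] = S^k_l in eps'^k = S^k_l eps^l.
# Assumption: S is the inverse of R, so that eps'^k(e'_i) = delta^k_i.
R = [[1, 1], [0, 1]]
S = [[1, -1], [0, 1]]

# T[l][m][n] = T^l_{mn}: arbitrary sample components in the unprimed basis.
T = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]


def transform(T):
    """Apply (20.12): T'^i_{jk} = S^i_l R^m_j R^n_k T^l_{mn}."""
    Tp = [[[0] * N for _ in range(N)] for _ in range(N)]
    for i, j, k in product(range(N), repeat=3):
        Tp[i][j][k] = sum(S[i][l] * R[m][j] * R[n][k] * T[l][m][n]
                          for l, m, n in product(range(N), repeat=3))
    return Tp


def evaluate(T, tau, u, w):
    """Value T(tau, u, w) computed from components given in one basis."""
    return sum(T[l][m][n] * tau[l] * u[m] * w[n]
               for l, m, n in product(range(N), repeat=3))


def covector_to_primed(tau):
    """Covariant components transform as tau'_i = R^j_i tau_j."""
    return [sum(R[j][i] * tau[j] for j in range(N)) for i in range(N)]


def vector_to_primed(v):
    """Contravariant components transform as v'^i = S^i_j v^j."""
    return [sum(S[i][j] * v[j] for j in range(N)) for i in range(N)]


tau, u, w = [2, -1], [1, 3], [4, 5]
old_value = evaluate(T, tau, u, w)
new_value = evaluate(transform(T), covector_to_primed(tau),
                     vector_to_primed(u), vector_to_primed(w))
print(old_value == new_value)
```

Integer entries keep the comparison exact; the same structure applies in higher dimensions at the cost of $N^3 \times N^3$ multiplications for the transformed components.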

Remark. Note that the above arguments do not downplay the role of compo- 
nents. In fact, when it comes to actual calculations, we are forced to choose a 
basis and manipulate the components. 

20.3.3 Natural Isomorphism 

We comment below on an important property that is specific to components of 
tensors $A \in \mathcal{T}^1_1(V)$. We know that tensors $A \in \mathcal{T}^1_1(V)$ are bilinear functions 
such as 

$$A : V^* \times V \to \mathbb{R}$$

and that their components $A^i_j$ are defined by 

$$A^i_j = A(\varepsilon^i, e_j), \tag{20.13}$$

where $\{\varepsilon^i\}$ and $\{e_j\}$ are bases of V* and V, respectively. Now we consider 
the matrix 

$$[A^i_j] = \begin{bmatrix} A^1_1 & A^1_2 & \cdots & A^1_n \\ A^2_1 & A^2_2 & \cdots & A^2_n \\ \vdots & \vdots & & \vdots \\ A^n_1 & A^n_2 & \cdots & A^n_n \end{bmatrix}, \tag{20.14}$$

whose elements $A^i_j$ are the same as those given in (20.13). We shall see that 
(20.14) is the matrix representation of a linear operator A in terms of the 
basis $\{e_i\}$ of V, which associates a vector $v \in V$ with another $u \in V$, i.e., 

$$A : V \to V.$$

A formal statement concerning this point is given below. 

Natural isomorphism: 

For any vector space V, there is a one-to-one correspondence (called a 
natural isomorphism) between a tensor $A \in \mathcal{T}^1_1(V)$ and a linear operator 
$A \in \mathcal{L}(V, V)$. 

Proof We write the tensor $A \in \mathcal{T}^1_1(V)$ as 

$$A = A^i_j\, e_i \otimes \varepsilon^j.$$

Given any $v \in V$, we obtain 

$$A(v) = (A^i_j\, e_i \otimes \varepsilon^j)(v) = A^i_j\, e_i\, [\varepsilon^j(v)] = A^i_j\, e_i\, [\varepsilon^j(v^k e_k)] = A^i_j\, e_i\, (v^k \delta^j_k) = A^i_j v^j e_i. \tag{20.15}$$

Observe that the $A^i_j v^j$ in the last term are real numbers and that $e_i \in V$. 
Hence, the object A(v) is a linear combination of the basis vectors $e_i$ for V, i.e., 

$$A(v) \in V.$$

Denoting A(v) in (20.15) by $u = u^i e_i$, we find that 

$$u^i = A^i_j v^j, \tag{20.16}$$

in which $u^i, v^j$ are contravariant components of the vectors $u, v \in V$, 
respectively, in terms of the basis $\{e_i\}$. The result (20.16) is identified with a matrix 
equation: 

$$\begin{bmatrix} u^1 \\ u^2 \\ \vdots \\ u^n \end{bmatrix} = \begin{bmatrix} A^1_1 & A^1_2 & \cdots & A^1_n \\ A^2_1 & A^2_2 & \cdots & A^2_n \\ \vdots & \vdots & & \vdots \\ A^n_1 & A^n_2 & \cdots & A^n_n \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ \vdots \\ v^n \end{bmatrix}. \tag{20.17}$$

Thus we can see that given $A \in \mathcal{T}^1_1(V)$, its components form the matrix 
representation $[A^i_j]$ of a linear operator A that transforms a vector $v \in V$ into 
another vector $u \in V$. 

Conversely, for any given linear operator on V with a matrix representation 
$[A^i_j]$ in terms of a basis of V, there exists a tensor $A \in \mathcal{T}^1_1(V)$. This suggests 
a one-to-one correspondence between the tensor space $\mathcal{T}^1_1(V)$ and the vector 
space $\mathcal{L}(V, V)$ comprising the linear mappings $f : V \to V$. ∎ 


A parallel discussion serves for a linear operator on V*. In fact, for any 
$\tau \in V^*$, we have 

$$A(\tau) = (A^i_j\, e_i \otimes \varepsilon^j)(\tau) = A^i_j\, [e_i(\tau)]\, \varepsilon^j = A^i_j\, [e_i(\tau_k \varepsilon^k)]\, \varepsilon^j = A^i_j\, (\tau_k \delta^k_i)\, \varepsilon^j = A^i_j \tau_i\, \varepsilon^j,$$

which means that $A(\tau)$ is a linear combination of the basis vectors $\varepsilon^j$ for V*, i.e., 

$$A(\tau) \in V^*.$$

Denoting $A(\tau)$ by $\theta = \theta_j \varepsilon^j$, we obtain 

$$\theta_j = A^i_j \tau_i, \tag{20.18}$$

where $\theta_j$ and $\tau_i$ are (covariant) components of the vectors $\theta, \tau \in V^*$ in terms 
of the basis $\{\varepsilon^i\}$. Using the same matrix representation $[A^i_j]$ as in (20.17), 
we can rewrite (20.18) as 

$$[\theta_1, \cdots, \theta_n] = [\tau_1, \cdots, \tau_n] \begin{bmatrix} A^1_1 & A^1_2 & \cdots & A^1_n \\ A^2_1 & A^2_2 & \cdots & A^2_n \\ \vdots & \vdots & & \vdots \\ A^n_1 & A^n_2 & \cdots & A^n_n \end{bmatrix},$$

which describes a linear mapping from a vector $\tau \in V^*$ to another vector 
$\theta \in V^*$ through the linear operator with the matrix representation $[A^i_j]$. 

We thus conclude that there is a natural isomorphism among the three 
vector spaces: 

$$\mathcal{T}^1_1(V) = \mathcal{L}(V^* \times V, \mathbb{R}), \quad \mathcal{L}(V, V), \quad \text{and} \quad \mathcal{L}(V^*, V^*).$$

Owing to this isomorphism, these three vector spaces can be treated as though 
they are the same. 
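
The three readings of the same array of components can be compared directly. The following sketch uses sample values for $A^i_j$, $\tau_i$ and $v^j$ (all assumed for illustration, as are the function names) and evaluates the array as a bilinear function, as an operator on V following (20.16), and as an operator on V* following (20.18).

```python
A = [[1, 2], [3, 4]]  # A[i][j] = A^i_j, sample values


def pair(tau, v):
    """Natural pairing tau(v) computed from components."""
    return sum(t * x for t, x in zip(tau, v))


def act_on_vector(A, v):
    """u^i = A^i_j v^j, as in (20.16): matrix times column vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


def act_on_covector(A, tau):
    """theta_j = A^i_j tau_i, as in (20.18): row vector times matrix."""
    return [sum(tau[i] * A[i][j] for i in range(len(tau)))
            for j in range(len(A[0]))]


def as_bilinear(A, tau, v):
    """A(tau, v) = tau_i A^i_j v^j, the tensor read as a bilinear function."""
    return sum(tau[i] * A[i][j] * v[j]
               for i in range(len(tau)) for j in range(len(v)))


tau, v = [1, -1], [2, 5]
print(as_bilinear(A, tau, v))
print(pair(tau, act_on_vector(A, v)))
print(pair(act_on_covector(A, tau), v))
```

All three printed values coincide for this sample data, which is the component-level content of the isomorphism stated above.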


20.3.4 Inner Product in Tensor Language 

As noted at the beginning of this chapter, our current discussion is applicable 
to any kind of vector space regardless of whether or not it is endowed with 
an inner product. If the spaces we are dealing with are inner product spaces, 
then all the results of Chaps. 18 and 19 are reproduced, owing to the assumption 
that only a Euclidean space (i.e., a real inner product space; see Sect. 19.2.3) 
is considered there. In this subsection, we shall see that this is true, but we 
first have to introduce the concept of inner product in connection with our 
current vector spaces. Such a discussion enables us to make the correspondence 
between the two views of tensors clear: 

tensors are sets of index quantities (in Chaps. 18 and 19), 


and 


tensors are linear mappings (in Chap. 20). 


Below is the definition of the inner product in the language of tensor 
calculus. 


Inner product: 

An inner product, denoted by $(\ ,\ )$, is a real-valued function such as 

$$(\ ,\ ) : V \times V \to \mathbb{R},$$

which has the following properties: 

(i) it is nondegenerate, i.e., 

$$(u, v) = 0 \ \text{for all } v \iff u = 0,$$

(ii) it is symmetric, i.e., $(u, v) = (v, u)$, 

(iii) it is positive definite, i.e., $(u, u) > 0$ whenever $u \neq 0$, and 

(iv) it is bilinear, i.e., $(au + bv, w) = a(u, w) + b(v, w)$. 


Remark. The set of four axioms above is a restatement of those presented in 
Sect. 4.1.3 for real vector spaces. 

By definition, the inner product of v and w reads 

$$(v, w) = (v^i e_i, w^j e_j) = v^i w^j (e_i, e_j),$$

where $(e_i, e_j)$ as well as $(v, w)$ are certain real numbers. Then, if we establish 
a matrix $[g_{ij}]$ with the entries 

$$g_{ij} = (e_i, e_j), \tag{20.19}$$

we have 

$$(v, w) = g_{ij} v^i w^j,$$

which reproduces the previous notation (18.27) obtained via the 
coordinate-dependent treatment of tensors. 


Remark. The notation in (20.19) seems to imply that the function $(\ ,\ )$ is 
written in terms of the dual basis $\varepsilon^i \in V^*$ by 

$$(\ ,\ ) = g_{ij}\, \varepsilon^i \otimes \varepsilon^j. \tag{20.20}$$

However, this is not the case because (20.20) does not satisfy the symmetric 
property required by (ii) in the above definition of an inner product. 
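
To see (20.19) at work, one can take the ordinary dot product on $\mathbb{R}^2$ as the inner product and a non-orthonormal basis, so that $[g_{ij}]$ is not the identity. This choice of inner product and basis, and the helper names below, are assumptions made only for the sketch.

```python
def dot(a, b):
    """Ordinary Euclidean dot product, used here as the inner product."""
    return sum(x * y for x, y in zip(a, b))


def metric(basis):
    """g_ij = (e_i, e_j), as in (20.19)."""
    return [[dot(ei, ej) for ej in basis] for ei in basis]


def inner_from_components(g, v, w):
    """(v, w) = g_ij v^i w^j computed from contravariant components."""
    n = len(g)
    return sum(g[i][j] * v[i] * w[j] for i in range(n) for j in range(n))


def assemble(basis, components):
    """Build the vector v = v^i e_i in Cartesian coordinates."""
    dim = len(basis[0])
    return [sum(c * e[k] for c, e in zip(components, basis)) for k in range(dim)]


basis = [(1, 0), (1, 1)]     # e_1, e_2: not orthonormal
g = metric(basis)
v_comp, w_comp = [2, 1], [1, 3]

print(g)
print(inner_from_components(g, v_comp, w_comp))
print(dot(assemble(basis, v_comp), assemble(basis, w_comp)))
```

The last two printed numbers agree: the inner product computed from components with $g_{ij}$ matches the dot product of the assembled vectors.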


20.3.5 Index Lowering and Raising in Tensor Language 

We demonstrate below another important consequence of the notation in 
(20.19). Since the inner product $(v, w)$ is a bilinear function with variables v 
and w, it is a linear function of w if we fix v. Assume a function $\nu : V \to \mathbb{R}$ 
defined by 

$$\nu(w) = (v, w).$$

Clearly, $\nu$ is a linear function of w and $\nu \in V^*$. Hence, $\nu$ can be expanded by 
the dual basis $\varepsilon^j \in V^*$, which results in 

$$\nu = \nu_j \varepsilon^j = \nu(e_j)\,\varepsilon^j = (v, e_j)\,\varepsilon^j = (v^i e_i, e_j)\,\varepsilon^j = v^i g_{ij}\, \varepsilon^j.$$

This indicates that the components $\nu_j$ of the linear function $\nu$ are given by 

$$\nu_j = g_{ij} v^i.$$

Denoting $\nu_j$ by $v_j$, we obtain 

$$v_j = g_{ij} v^i, \tag{20.21}$$

which is identified with the index lowering of $v^i$ by the use of $g_{ij}$. Emphasis is 
placed on the fact that the result (20.21) gives a one-to-one relation between 
$v \in V$ and $\nu \in V^*$ via the entries $g_{ij} = (e_i, e_j)$. That is, going from a vector 
$v \in V$ to its unique image $\nu \in V^*$ is achieved by simply lowering the index of 
the contravariant component of v through relation (20.21). 

The counterpart of (20.21), index raising, is obtained by noting the fact 
that by hypothesis the matrix $[g_{ij}]$ is nondegenerate. This implies the existence 
of the inverse matrix denoted by $[g^{ij}]$. Multiplying both sides of (20.21) by the 
elements $g^{kj}$ yields 

$$g^{kj} v_j = g^{kj} g_{ij} v^i = g^{kj} g_{ji} v^i = \delta^k_i v^i = v^k,$$

i.e., 

$$v^i = g^{ij} v_j.$$

We have thus shown that the introduction of the matrix $[g_{ij}]$ composed of 
the real numbers $g_{ij}$ defined by (20.19) provides a bridge between the two 
apparently different viewpoints (those in Chaps. 18, 19 and in Chap. 20) 
regarding tensors. 
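
Index lowering and raising can be traced numerically in two dimensions. The sketch below assumes a sample symmetric, nondegenerate matrix $[g_{ij}]$ and sample contravariant components; the function names are hypothetical. Exact rational arithmetic is used for the inverse so that raising the lowered components returns the original ones exactly.

```python
from fractions import Fraction


def inverse_2x2(g):
    """Inverse [g^{ij}] of a nondegenerate 2x2 matrix [g_{ij}]."""
    (a, b), (c, d) = g
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("the matrix g is degenerate")
    return [[d / det, -b / det], [-c / det, a / det]]


def lower(g, v_up):
    """v_j = g_ij v^i, as in (20.21)."""
    n = len(g)
    return [sum(g[i][j] * v_up[i] for i in range(n)) for j in range(n)]


def raise_index(g_inv, v_down):
    """v^i = g^{ij} v_j, the counterpart of (20.21)."""
    n = len(g_inv)
    return [sum(g_inv[i][j] * v_down[j] for j in range(n)) for i in range(n)]


g = [[1, 1], [1, 2]]          # sample g_ij (symmetric, determinant 1)
v_up = [2, 1]                 # sample contravariant components v^i

v_down = lower(g, v_up)
v_back = raise_index(inverse_2x2(g), v_down)
print(v_down, v_back)
```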



Part VII 


Appendixes 




A 


Proof of the Bolzano-Weierstrass Theorem 


A.1 Limit Points 

In this appendix we prove the Bolzano-Weierstrass theorem, first 
introduced in Sect. 2.2.2, which guarantees the existence of a limit point in some 
sets of real numbers. For a better understanding, we begin with a brief review 
of the basic properties of limit points. 

Below we repeat the definition of a limit point from Sect. 1.1.5. 

Limit point: 

A point $x \in \mathbb{R}$ is called a limit point of a set $S \subset \mathbb{R}$ if every 
neighborhood V of x contains an element of S different from x. 

We denote by $\hat{S}$ the set of limit points of S. A point in S that is not a limit 
point of S is called an isolated point of S. A limit point is often referred to 
as a cluster point or accumulation point. 

Observe that $x \in \hat{S}$ if and only if every neighborhood of x contains an 
infinite number of points of S. This is so because if a neighborhood 

$$V = (x - \delta, x + \delta)$$

of a limit point x contains only a finite number of points of S, say $a_1, a_2, \cdots, a_n$, 
where $a_i \neq x$, then there is a positive number $\varepsilon$ such that 

$$\varepsilon = \min_{1 \le i \le n} |a_i - x|.$$

Since x is a limit point of S, there is a point $a \in S$ such that $a \neq x$ and 
$|x - a| < \varepsilon$. This means that $a \in V$ but $a \neq a_i$ for any i, which contradicts the 
assumption that V contains only n points of S. The implication in the other 
direction is obvious. 

A finite set cannot have a limit point, since any neighborhood of a limit 
point must contain an infinite number of elements of the set. On the other 
hand, an infinite set may or may not have a limit point. 

A.2 Cantor Theorem 

The previous discussion raises the question: When does a set possess a limit 
point? The following theorem serves as a lemma to answer this question. 
Meanwhile we denote by $\ell(I) = b - a$ the length of any closed interval $I = [a, b]$ 
with $a \le b$. 


Cantor theorem: 

Let $(I_n)$ be a sequence of nonempty, closed, and bounded intervals. If 
$I_{n+1} \subset I_n$ for every $n \in \mathbb{N}$, then the intersection $\bigcap_{n=1}^{\infty} I_n$ is not empty. 
Furthermore, if 

$$\inf\{\ell(I_n) : n \in \mathbb{N}\} = 0,$$

then $\bigcap_{n=1}^{\infty} I_n$ is a single point. 


Proof Suppose $I_n = [a_n, b_n]$ and $I = \bigcap_{n=1}^{\infty} I_n$. Using the nested property of 
the intervals $I_n$, we have 

$$m \ge p \ \Rightarrow\ I_m \subseteq I_p \ \Rightarrow\ a_p \le a_m \le b_m \le b_p. \tag{A.1}$$

Clearly, the set $S = \{a_n : n \in \mathbb{N}\} \subset \mathbb{R}$ is not empty and is bounded above 
by $b_1$. Hence, the set S has a least upper bound, which we call x. If we can 
prove that $x \in I$, we can conclude that I is not empty. In fact, this can be 
proven by observing that 

$$x \in I_k \quad \text{for all } k \in \mathbb{N},$$

i.e., 

$$a_k \le x \le b_k \quad \text{for all } k \in \mathbb{N}. \tag{A.2}$$

First, it is obvious from the definition of x that $a_k \le x$ for all k. Second, 
$b_k$ for arbitrary k satisfies $a_n \le b_k$ for all $n \in \mathbb{N}$, i.e., $b_k$ is an upper bound of 
S. In fact, if $n \le k$ then, by (A.1), $a_n \le a_k \le b_k$; and if $n > k$ then, again by 
(A.1), $a_n \le b_n \le b_k$. Finally, it follows that $x \le b_k$ for all $k \in \mathbb{N}$, since x is the 
least upper bound of S, whereas $b_k$ is just an upper bound of S. Thus we can 
conclude that (A.2) is true. 

Now we consider the second statement in the above theorem. Suppose that 

$$\inf\{\ell(I_n) : n \in \mathbb{N}\} = 0,$$

and let $x, y \in I$. It then follows that $x, y \in I_n$ for every n, which implies that 

$$|x - y| \le \ell(I_n) \quad \text{for all } n \in \mathbb{N},$$

so 

$$|x - y| \le \inf\{\ell(I_n) : n \in \mathbb{N}\} = 0.$$

This means that $x = y$, i.e., the intersection 

$$I = \bigcap_{n=1}^{\infty} I_n$$

is a single point. ∎ 


A.3 Bolzano-Weierstrass Theorem 

We are now ready to prove the Bolzano-Weierstrass theorem, which gives us 
sufficient conditions for the existence of a limit point in a set. 

Bolzano-Weierstrass theorem: 

Every infinite and bounded subset of $\mathbb{R}$ has at least one limit point in $\mathbb{R}$. 


Proof Let S be an infinite and bounded set of real numbers. Being bounded, 
S is contained in some bounded closed interval $I_0 = [a_0, b_0]$. First we bisect 
$I_0$ into the two subintervals 

$$I_0' = \left[a_0, \frac{a_0 + b_0}{2}\right], \qquad I_0'' = \left[\frac{a_0 + b_0}{2}, b_0\right].$$

Since $S \subset (I_0' \cup I_0'')$ is infinite, at least one of the two sets $S \cap I_0'$ and $S \cap I_0''$ 
is infinite. Let $I_1 = [a_1, b_1] = I_0'$ if $S \cap I_0'$ is infinite; otherwise let $I_1 = I_0''$. We 
then have 

$$I_1 \subset I_0, \qquad \ell(I_1) = \frac{b_0 - a_0}{2}.$$

Now we bisect $I_1$ into the subintervals 

$$I_1' = \left[a_1, \frac{a_1 + b_1}{2}\right], \qquad I_1'' = \left[\frac{a_1 + b_1}{2}, b_1\right],$$

one of which necessarily intersects S in an infinite set. Let $I_2 = [a_2, b_2] = I_1'$ if 
$S \cap I_1'$ is infinite; otherwise let $I_2 = I_1''$. Continuing in this fashion, we obtain 
the intervals $I_i = [a_i, b_i]$, $0 \le i \le n$, which satisfy 

$$I_i \subseteq I_{i-1}, \qquad \ell(I_i) = \frac{b_0 - a_0}{2^i} \qquad (1 \le i \le n),$$

and the fact that $S \cap I_i$ is infinite. We bisect $I_n$ again to obtain 

$$I_n' = \left[a_n, \frac{a_n + b_n}{2}\right], \qquad I_n'' = \left[\frac{a_n + b_n}{2}, b_n\right].$$

Since $I_n = I_n' \cup I_n''$ and $S \cap I_n$ is infinite, either $S \cap I_n'$ is infinite, in which case we 
set $I_{n+1} = [a_{n+1}, b_{n+1}] = I_n'$, or $S \cap I_n''$ is infinite, in which case $I_{n+1} = I_n''$ is chosen. 
Now we see that 

$$I_{n+1} \subseteq I_n, \qquad \ell(I_{n+1}) = \frac{b_0 - a_0}{2^{n+1}} = \frac{\ell(I_n)}{2},$$

and that $S \cap I_{n+1}$ is infinite. By induction, therefore, it is shown that there 
exists a nested sequence $(I_n)$ of nonempty, closed, and bounded intervals. In view of 
Cantor's theorem, and since $\ell(I_n) = (b_0 - a_0)/2^n \to 0$, we see that the intersection 
$\bigcap_{n=1}^{\infty} I_n$ consists of a single point x. We now complete our proof by showing that 
x is a limit point of S. 

Suppose that $x \in \bigcap_{n=1}^{\infty} I_n$. Given any $\varepsilon > 0$, we can choose $n \in \mathbb{N}$ so that 

$$\frac{b_0 - a_0}{2^n} < \varepsilon,$$

or equivalently, 

$$\ell(I_n) < \varepsilon.$$

This, together with the fact that $x \in I_n$ for all n, implies that 

$$I_n \subset (x - \varepsilon, x + \varepsilon).$$

Since $I_n$ contains an infinite number of elements of S, so does the 
neighborhood $(x - \varepsilon, x + \varepsilon)$ of x; hence x is a limit point of S. ∎ 
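
The bisection argument can be imitated on a computer, with one important caveat: a program can only inspect finitely many points, so the condition "contains infinitely many points of S" has to be replaced by a stand-in. The sketch below makes the assumption that keeping the half-interval holding at least as many points of a large finite sample is an acceptable proxy; the sample sequence $x_n = (-1)^n (1 + 1/n)$ and the name `bisect_cluster` are likewise illustrative choices, not part of the proof.

```python
def bisect_cluster(points, a, b, steps):
    """Repeatedly keep the closed half of [a, b] holding more sample points.

    Ties are resolved in favour of the left half, mirroring the order of the
    case distinction in the proof.
    """
    for _ in range(steps):
        m = (a + b) / 2
        left = sum(1 for p in points if a <= p <= m)
        right = sum(1 for p in points if m <= p <= b)
        if left >= right:
            b = m
        else:
            a = m
    return a, b


# Finite sample of a bounded sequence whose limit points are -1 and +1.
sample = [(-1) ** n * (1 + 1 / n) for n in range(1, 10001)]

a, b = bisect_cluster(sample, -2.0, 2.0, 10)
print(a, b)
```

For this sample the nested intervals close in on the limit point $-1$. If too many bisection steps are taken relative to the sample size, the finite sample stops being a reasonable proxy and the procedure homes in on individual sample points instead.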



B 


Dirac δ Function 


B.1 Basic Properties 


In this appendix, we review the properties and various expressions of Dirac's 
$\delta$ function. The first thing to be noted is that the $\delta$ function is not a function 
at all. A function is a rule that assigns another number to each number in a 
set of numbers. However, the $\delta$ function as used in physics is rather a 
shorthand notation for a complicated limiting process whose use greatly simplifies 
calculations. It takes on a meaning only when it appears under an integral 
sign, in which case it behaves as 

$$\int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0). \tag{B.1}$$

For the special case of $f(x) = 1$, we have 

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1. \tag{B.2}$$

If the singular point is located at an arbitrary point $x_0$, then 

$$\int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0). \tag{B.3}$$

Except at the singular point $x = 0$, 

$$\delta(x) = 0. \tag{B.4}$$

Thus $\delta(x)$ vanishes at all points where its argument is not zero, but at that 
one point it is undefined. Nevertheless its behavior near this point is all that 
matters. 

Let $\delta_a(x)$ be a set of functions parametrized by the index a that has the 
properties 

$$\lim_{a \to 0} \delta_a(x) = 0 \quad \text{for all } x \neq 0, \qquad \lim_{a \to 0} \int_{-\infty}^{+\infty} f(x)\,\delta_a(x)\,dx = f(0). \tag{B.5}$$

In precise terms the original equations defining the $\delta$ function must be 
interpreted as standing for the limiting processes of (B.5). 


B.2 Representation as a Limit of Functions 

In what follows, we look at several sets of functions that are endowed with the 
properties described in (B.5). 


1. The limit of a box function 

The simplest example is the function $\delta_c(x)$ defined (for $c > 0$) by 

$$\delta_c(x) = \begin{cases} 1/c & \text{for } |x| < c/2, \\ 0 & \text{for } |x| > c/2. \end{cases} \tag{B.6}$$

Clearly, $\lim_{c \to 0} \delta_c(x) = 0$ at all $x \neq 0$ and $\int_{-\infty}^{\infty} \delta_c(x)\,dx = 1$ independent of c. 
In addition, we have 

$$\lim_{c \to 0} \int_{-\infty}^{\infty} f(x)\,\delta_c(x)\,dx = f(0), \tag{B.7}$$

which can be shown formally for continuous functions f(x) as 

$$\lim_{c \to 0} \int_{-\infty}^{\infty} f(x)\,\delta_c(x)\,dx = \lim_{c \to 0} \int_{-c/2}^{c/2} f(x)\,\delta_c(x)\,dx = \lim_{c \to 0} \frac{1}{c} \int_{-c/2}^{c/2} f(x)\,dx = \lim_{c \to 0} \frac{f(\xi c)}{c} \int_{-c/2}^{c/2} dx = \lim_{c \to 0} f(\xi c).$$

In the last line, the mean value theorem of integral calculus was employed 
with $-1/2 < \xi < 1/2$. Letting $c \to 0$, we obtain (B.7). 
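
The limit (B.7) can be observed numerically. The following sketch approximates $\int f(x)\,\delta_c(x)\,dx$ by a midpoint rule for a decreasing sequence of widths c, using $f(x) = \cos x$ as a test function with $f(0) = 1$. The choice of test function, the quadrature rule, the number of sample points, and the function names are all assumptions made for illustration.

```python
import math


def box_delta(x, c):
    """The box function (B.6): 1/c inside (-c/2, c/2), zero outside."""
    return 1.0 / c if abs(x) < c / 2 else 0.0


def smeared_value(f, c, samples=2000):
    """Midpoint-rule approximation of the integral of f(x) * delta_c(x).

    Only the interval [-c/2, c/2] is sampled, since the integrand vanishes
    elsewhere.
    """
    h = c / samples
    total = 0.0
    for k in range(samples):
        x = -c / 2 + (k + 0.5) * h
        total += f(x) * box_delta(x, c) * h
    return total


for c in (1.0, 0.1, 0.01):
    print(c, smeared_value(math.cos, c))
```

The printed values approach $f(0) = 1$ as c shrinks, in agreement with the mean value argument above.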


2. The limit of a Gaussian function 

The sequence of the Gaussian distribution function 

$$\delta_a(x) = \frac{1}{a\sqrt{\pi}}\, e^{-x^2/a^2}$$

provides another representation of the $\delta$ function. Note that $\lim_{a \to 0} \delta_a(x) = 0$ 
at all $x \neq 0$, $\int_{-\infty}^{\infty} \delta_a(x)\,dx = 1$ independent of a, and 
$\lim_{a \to 0} \int_{-\infty}^{\infty} f(x)\,\delta_a(x)\,dx = f(0)$. The entire contribution to the integral, as $a \to 0$, comes from the 
neighborhood of $x = 0$. Therefore, we may write symbolically, 

$$\delta(x) = \lim_{a \to 0} \delta_a(x) = \lim_{a \to 0} \frac{1}{a\sqrt{\pi}}\, e^{-x^2/a^2}.$$

3. The limit of a Lorentzian function 

Another useful representation for the $\delta$ function is 

$$\delta(x) = \lim_{\varepsilon \to 0} \delta_\varepsilon(x) = \lim_{\varepsilon \to 0} \frac{1}{\pi} \frac{\varepsilon}{x^2 + \varepsilon^2},$$

which the reader can work out as in the above example. 


4. The $n \to \infty$ limit of a sequence of functions 


The final representation of the $\delta$ function is slightly different from the 
preceding three and plays a central role in the proof of the Weierstrass theorem, as 
demonstrated in Appendix C. It is defined as 

$$\delta_n(x) = \begin{cases} c_n (1 - x^2)^n & \text{for } 0 \le |x| \le 1 \quad (n = 1, 2, 3, \cdots), \\ 0 & \text{for } |x| > 1, \end{cases} \tag{B.8}$$

where the constant $c_n$ must be determined so that 

$$\int_{-1}^{1} \delta_n(x)\,dx = 1. \tag{B.9}$$

The functions $\delta_n(x)$ form a sequence whose limit is a $\delta$ function. This 
representation of the $\delta$ function differs from the others in that the defining parameter 
n increases to infinity, rather than decreasing continuously to zero. 


B.3 Remarks on Representation 4 


We show below that representation 4 above satisfies the conditions (B.5) required for identification with Dirac’s $\delta$ function. First, we determine the normalization constant $c_n$. From the hypothesis (B.9), we have

$$\frac{1}{c_n} = \int_{-1}^{1}(1-x^2)^n\,dx. \tag{B.10}$$
Making the change of variable $x = \sin\theta$, we obtain

$$\frac{1}{c_n} = 2\int_0^{\pi/2}\cos^{2n+1}\theta\,d\theta = \frac{2^{n+1}\,n!}{1\cdot 3\cdot 5\cdots(2n+1)}, \tag{B.11}$$

which becomes

$$c_n = \frac{(2n+1)!}{2^{2n+1}(n!)^2}. \tag{B.12}$$

Next we consider the asymptotic behavior of $c_n$ as $n \to \infty$. It follows from (B.10) that


$$\frac{1}{c_n} = 2\int_0^1 (1-x^2)^n\,dx \ge 2\int_0^{1/\sqrt{n}}(1-x^2)^n\,dx, \tag{B.13}$$

since $1/\sqrt{n} \le 1$ for all $n = 1, 2, \cdots$ and the integrand is nonnegative throughout $[0, 1]$.
Now we consider the function 


$$g(x) = (1-x^2)^n - (1-nx^2).$$

Since $g(0) = 0$ and

$$g'(x) = 2nx\left[1-(1-x^2)^{n-1}\right] \ge 0 \quad \text{for all } 0 \le x \le 1,$$

$g(x)$ must be monotonically increasing in the interval $[0, 1]$. Therefore, $g(x) \ge 0$, or equivalently,

$$(1-x^2)^n \ge 1-nx^2$$

for all $x$ in $[0, 1]$. Using this inequality in (B.13), we have


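$$\frac{1}{c_n} \ge 2\int_0^{1/\sqrt{n}}(1-nx^2)\,dx = \frac{4}{3\sqrt{n}} > \frac{1}{\sqrt{n}},$$

i.e.,

$$c_n < \sqrt{n}. \tag{B.14}$$

This result implies that the limit $n \to \infty$ of the function $\delta_n(x)$ given in (B.8) equals zero for all $x \neq 0$, because $0 \le \delta_n(x) < \sqrt{n}\,(1-x^2)^n \to 0$ for any fixed $x$ with $0 < |x| \le 1$.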
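The closed form (B.12) and the bound (B.14) can be verified with exact rational arithmetic. This is a small sketch under stated assumptions: the helper names `c_n` and `integral_of_kernel` are hypothetical, and the integral of $(1-x^2)^n$ over $[-1, 1]$ is evaluated by expanding with the binomial theorem rather than by the trigonometric substitution used in the text.

```python
from fractions import Fraction
from math import comb, factorial, sqrt

def c_n(n):
    """Normalization constant from (B.12): (2n+1)! / (2^(2n+1) (n!)^2)."""
    return Fraction(factorial(2 * n + 1), 2 ** (2 * n + 1) * factorial(n) ** 2)

def integral_of_kernel(n):
    """Exact integral of (1 - x^2)^n over [-1, 1] via the binomial expansion.

    Each term (-1)^k C(n,k) x^(2k) integrates to 2 (-1)^k C(n,k) / (2k+1).
    """
    return sum(Fraction(2 * (-1) ** k * comb(n, k), 2 * k + 1) for k in range(n + 1))

for n in (1, 2, 5, 20, 100):
    cn = c_n(n)
    # Columns: n, c_n, sqrt(n), and whether c_n normalizes the kernel exactly.
    print(n, float(cn), sqrt(n), cn * integral_of_kernel(n) == 1)
```

The last column confirms the normalization (B.9) exactly, and comparing the second and third columns illustrates $c_n < \sqrt{n}$.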

Finally, we examine the validity of the relation

$$\lim_{n\to\infty}\int_{-1}^{1} f(x)\,\delta_n(x)\,dx = f(0). \tag{B.15}$$

To prove this, it suffices to verify that the contribution to the integral $\int_{-1}^{1}\delta_n(x)\,dx$ comes increasingly from the neighborhood surrounding the origin as $n \to \infty$. Note that for $0 < \varepsilon < 1$,

$$\int_{\varepsilon}^{1}\delta_n(x)\,dx = \int_{-1}^{-\varepsilon}\delta_n(x)\,dx, \tag{B.16}$$
since $\delta_n(x)$ is an even function of $x$. Now

$$\int_{\varepsilon}^{1}\delta_n(x)\,dx < \sqrt{n}\int_{\varepsilon}^{1}(1-x^2)^n\,dx < \sqrt{n}\,(1-\varepsilon^2)^n(1-\varepsilon) < \sqrt{n}\,(1-\varepsilon^2)^n, \tag{B.17}$$

where we employ the fact that $(1-x^2)^n$ in the interval $[\varepsilon, 1]$ takes its maximum at $x = \varepsilon$. The exponential decay of the term $(1-\varepsilon^2)^n$ with $n$ dominates the growth of the term $\sqrt{n}$, so that

$$\lim_{n\to\infty}\int_{\varepsilon}^{1}\delta_n(x)\,dx = 0. \tag{B.18}$$

The results (B.16) and (B.18) justify the desired relation (B.15).
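The concentration of $\delta_n$ near the origin can also be observed numerically. The sketch below is illustrative only: `tail_mass` and `bound` are hypothetical helper names, the tail integral is estimated with a midpoint rule, and $c_n$ is evaluated through `math.lgamma` to avoid very large factorials. It compares the tail mass over $[\varepsilon, 1]$ with the bound $\sqrt{n}\,(1-\varepsilon^2)^n$ from (B.17).

```python
import math

def c_n(n):
    """c_n = (2n+1)! / (2^(2n+1) (n!)^2), computed via log-gamma."""
    log_c = math.lgamma(2 * n + 2) - (2 * n + 1) * math.log(2.0) - 2.0 * math.lgamma(n + 1)
    return math.exp(log_c)

def tail_mass(n, eps, steps=20000):
    """Midpoint-rule estimate of the integral of delta_n(x) over [eps, 1]."""
    h = (1.0 - eps) / steps
    total = 0.0
    for i in range(steps):
        x = eps + (i + 0.5) * h
        total += (1.0 - x * x) ** n
    return c_n(n) * total * h

def bound(n, eps):
    """Right-hand side of (B.17): sqrt(n) * (1 - eps^2)^n."""
    return math.sqrt(n) * (1.0 - eps * eps) ** n

eps = 0.2
for n in (10, 100, 1000):
    print(n, tail_mass(n, eps), bound(n, eps))
```

For a fixed $\varepsilon$, both columns shrink rapidly as $n$ grows, with the tail mass staying below the bound.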




C

Proof of Weierstrass Approximation Theorem

Weierstrass approximation theorem:

If a function $f(x)$ is continuous on the closed interval $[a, b]$, there exists a sequence of polynomials

$$G_n(x) = \sum_{k=0}^{n} c_k x^k \tag{C.1}$$

that converges uniformly to $f(x)$ on $[a, b]$.

To prove this, we may assume without loss of generality that $f(x)$ is defined on $[0, 1]$ and that $f(0) = f(1) = 0$. Outside the interval $[0, 1]$, we may define $f(x)$ to be identically zero. Then, the relevant polynomial (C.1) is given in integral form by

$$G_n(x) = \int_{-1}^{1} f(x+t)\,\delta_n(t)\,dt, \qquad 0 \le x \le 1. \tag{C.2}$$


Here $\delta_n(t)$ is the sequence of functions given by

$$\delta_n(t) = \begin{cases} c_n(1-t^2)^n & \text{for } -1 \le t \le 1, \\ 0 & \text{for } |t| > 1, \end{cases}$$

where $c_n$ is

$$c_n = \frac{(2n+1)!}{2^{2n+1}(n!)^2} \quad \text{so that} \quad \int_{-1}^{1}\delta_n(t)\,dt = 1.$$

(In fact, the sequence $\delta_n(t)$ as $n \to \infty$ does have the properties characterizing a $\delta$ function; see Appendix B.) Since $f(x)$ is assumed to vanish outside the interval $[0, 1]$, (C.2) can be rewritten as

$$G_n(x) = \int_{-x}^{1-x} f(x+t)\,\delta_n(t)\,dt.$$





By a change of variable $t \to t - x$, we obtain

$$G_n(x) = \int_0^1 f(t)\,\delta_n(t-x)\,dt = \int_0^1 f(t)\,c_n\left[1-(t-x)^2\right]^n dt.$$

This last integral shows that $G_n(x)$ is a polynomial of degree $2n$ in $x$. In what follows, we prove that the sequence of polynomials $\{G_1(x), G_2(x), \cdots\}$ converges uniformly to $f(x)$.

Since $f(x)$ is continuous on $[0, 1]$ (and hence uniformly continuous there), there exists a sufficiently small $\delta$ such that, for a given $\varepsilon > 0$,

$$|f(x+\delta) - f(x)| < \varepsilon$$

for all $x$ in $[0, 1]$. Now, we use (C.2) for $G_n(x)$ to obtain the quantity

$$|G_n(x) - f(x)| = \left|\int_{-1}^{1}\left[f(x+t) - f(x)\right]\delta_n(t)\,dt\right| \le \int_{-1}^{1}|f(x+t) - f(x)|\,\delta_n(t)\,dt, \tag{C.3}$$

where $\delta_n(t) \ge 0$ on $t \in [-1, 1]$. If the last term in (C.3) vanishes as $n \to \infty$, the proof of the theorem is complete.

To show this, we break up the range of integration into three parts,

$$\int_{-1}^{1} = \int_{-1}^{-\gamma} + \int_{-\gamma}^{\gamma} + \int_{\gamma}^{1},$$

where $\gamma$ is a certain small positive number. Since $f(x)$ is continuous on a closed interval, it is bounded there. Let the maximum value of $|f(x)|$ be $M$. Then the last of these integrals satisfies

$$\int_{\gamma}^{1}|f(x+t) - f(x)|\,\delta_n(t)\,dt \le \int_{\gamma}^{1}|f(x+t)|\,\delta_n(t)\,dt + \int_{\gamma}^{1}|f(x)|\,\delta_n(t)\,dt$$

$$\le 2M\int_{\gamma}^{1}\delta_n(t)\,dt < 2M\sqrt{n}\,(1-\gamma^2)^n, \tag{C.4}$$

where we have used the inequality (B.17). Similar arguments yield

$$\int_{-1}^{-\gamma}|f(x+t) - f(x)|\,\delta_n(t)\,dt < 2M\sqrt{n}\,(1-\gamma^2)^n. \tag{C.5}$$


Finally, the remaining integral, $\int_{-\gamma}^{\gamma}$, is estimated by using the continuity of $f(x)$, which guarantees that for any $\varepsilon$ we can find a sufficiently small $\gamma$ that satisfies

$$|t| < \gamma \;\Rightarrow\; |f(x+t) - f(x)| < \varepsilon.$$

This yields

$$\int_{-\gamma}^{\gamma}|f(x+t) - f(x)|\,\delta_n(t)\,dt < \varepsilon\int_{-\gamma}^{\gamma}\delta_n(t)\,dt < \varepsilon, \tag{C.6}$$

since $\int_{-\gamma}^{\gamma}\delta_n(t)\,dt < 1$.

Collecting the results of (C.3)-(C.6), we have

$$|G_n(x) - f(x)| < 4M\sqrt{n}\,(1-\gamma^2)^n + \varepsilon.$$

The value of $\sqrt{n}\,(1-\gamma^2)^n$ for $0 < \gamma < 1$ can be made arbitrarily small for large enough $n$; in particular, the first term on the right can be made smaller than $\varepsilon$. Therefore, there exists an $N$ such that

$$n > N \;\Rightarrow\; |G_n(x) - f(x)| < 2\varepsilon$$

for any arbitrarily small preassigned $\varepsilon$, where $N$ does not depend on $x$. This means that the sequence of polynomials $G_n(x)$ converges uniformly to the continuous function $f(x)$ on $[0, 1]$, which completes the proof. We emphasize that the above discussion holds for an arbitrary continuous function on an arbitrary finite closed interval $[a, b]$, as was indicated at the outset.
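The uniform convergence of the polynomials $G_n(x)$ can be illustrated numerically. The following is a rough sketch rather than a definitive implementation: the helper names `landau_polynomial` and `sup_error`, the test function $f(x) = \sin \pi x$ (which satisfies $f(0) = f(1) = 0$), the midpoint quadrature, and the 21-point evaluation grid are all assumptions made for illustration.

```python
import math

def c_n(n):
    """c_n = (2n+1)! / (2^(2n+1) (n!)^2), computed via log-gamma."""
    log_c = math.lgamma(2 * n + 2) - (2 * n + 1) * math.log(2.0) - 2.0 * math.lgamma(n + 1)
    return math.exp(log_c)

def landau_polynomial(f, n, x, steps=2000):
    """Midpoint-rule estimate of G_n(x) = int_0^1 f(t) c_n [1 - (t - x)^2]^n dt."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += f(t) * (1.0 - (t - x) ** 2) ** n
    return c_n(n) * total * h

def sup_error(f, n, grid=21):
    """Largest |G_n(x) - f(x)| over an evenly spaced grid on [0, 1]."""
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(landau_polynomial(f, n, x) - f(x)) for x in xs)

def f(x):
    # Continuous on [0, 1] and vanishing at both endpoints.
    return math.sin(math.pi * x)

for n in (5, 20, 80):
    print(n, sup_error(f, n))
```

The printed error decreases as $n$ grows, but slowly; the factor $\sqrt{n}\,(1-\gamma^2)^n$ in the estimate above only becomes small once $n$ is large compared with $1/\gamma^2$.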

Remark. It should be emphasized that our initial hypothesis that $f(0) = f(1) = 0$ imposes no limitation on the validity of the proof. To see this, we now suppose that $f(x)$ is defined on $[a, b]$. Then, the function $g(z)$ defined by

$$g(z) = f\big(a + (b-a)z\big)$$

yields $f(a) = g(0)$ and $f(b) = g(1)$, where any $x$ in the interval $[a, b]$ corresponds to $z$ in $[0, 1]$. Furthermore, by introducing the function

$$h(z) = g(z) - g(0) - z\left[g(1) - g(0)\right]$$

for $z$ in $[0, 1]$, we have $h(0) = 0$ and $h(1) = 0$. We can show that the polynomial $G_n(x)$ that approximates the original function $f(x)$ also approximates the modified function $h(z)$ by replacing the variable $x$ in $G_n(x)$ by $z$.
Tabulated List of Orthonormal Polynomial Functions


Hermite Polynomials $H_n(x)$

Orthogonality:

$$\int_{-\infty}^{\infty} e^{-x^2} H_m(x) H_n(x)\,dx = 2^n n!\sqrt{\pi}\,\delta_{mn}.$$

Rodrigues formula:

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right).$$

Differential equation:

$$\frac{d^2}{dx^2}H_n(x) - 2x\frac{d}{dx}H_n(x) + 2nH_n(x) = 0.$$

Recurrence formula:

$$H_{n+1}(x) - 2xH_n(x) + 2nH_{n-1}(x) = 0.$$

Generating functions:

$$g(t,x) = e^{2xt - t^2} = \sum_{n=0}^{\infty}\frac{H_n(x)}{n!}\,t^n.$$


Laguerre Polynomials $L_n^{\nu}(x)$

Orthogonality:

$$\int_0^{\infty} x^{\nu} e^{-x} L_m^{\nu}(x) L_n^{\nu}(x)\,dx = \frac{\Gamma(n+\nu+1)}{n!}\,\delta_{mn}.$$

Rodrigues formula:

$$L_n^{\nu}(x) = \frac{x^{-\nu}e^{x}}{n!}\frac{d^n}{dx^n}\left(e^{-x}x^{\nu+n}\right).$$

Differential equation:

$$x\frac{d^2}{dx^2}L_n^{\nu}(x) + (\nu+1-x)\frac{d}{dx}L_n^{\nu}(x) + nL_n^{\nu}(x) = 0.$$

Recurrence formula:

$$(n+1)L_{n+1}^{\nu}(x) - (2n+\nu+1-x)L_n^{\nu}(x) + (n+\nu)L_{n-1}^{\nu}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{e^{-xt/(1-t)}}{(1-t)^{\nu+1}} = \sum_{n=0}^{\infty} L_n^{\nu}(x)\,t^n.$$


Jacobi Polynomials $G_n^{(\nu,\mu)}(x)$

Orthogonality:

$$\int_{-1}^{1}(1+x)^{\mu}(1-x)^{\nu}\,G_m^{(\nu,\mu)}(x)\,G_n^{(\nu,\mu)}(x)\,dx = \frac{2^{\mu+\nu+1}\,\Gamma(n+\mu+1)\,\Gamma(n+\nu+1)}{n!\,(2n+\mu+\nu+1)\,\Gamma(n+\nu+\mu+1)}\,\delta_{mn}.$$

Rodrigues formula:

$$G_n^{(\nu,\mu)}(x) = \frac{(-1)^n}{2^n n!}(1-x)^{-\nu}(1+x)^{-\mu}\frac{d^n}{dx^n}\left[(1-x)^{\nu+n}(1+x)^{\mu+n}\right].$$

Differential equation:

$$(1-x^2)\frac{d^2}{dx^2}G_n^{(\nu,\mu)}(x) + \left[\mu-\nu-(\nu+\mu+2)x\right]\frac{d}{dx}G_n^{(\nu,\mu)}(x) + n(n+\nu+\mu+1)\,G_n^{(\nu,\mu)}(x) = 0.$$

Recurrence formula:

$$G_0^{(\nu,\mu)}(x) = 1, \qquad G_1^{(\nu,\mu)}(x) = \frac{1}{2}\left\{(\nu+\mu+2)x + (\nu-\mu)\right\},$$

$$2(n+1)(n+\nu+\mu+1)(2n+\nu+\mu)\,G_{n+1}^{(\nu,\mu)}(x)$$

$$-\,(2n+\nu+\mu+1)\left[(2n+\nu+\mu)(2n+\nu+\mu+2)x + \nu^2-\mu^2\right]G_n^{(\nu,\mu)}(x)$$

$$+\,2(n+\nu)(n+\mu)(2n+\nu+\mu+2)\,G_{n-1}^{(\nu,\mu)}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{2^{\nu+\mu}}{(1-2xt+t^2)^{1/2}\left[1-t+(1-2xt+t^2)^{1/2}\right]^{\nu}\left[1+t+(1-2xt+t^2)^{1/2}\right]^{\mu}} = \sum_{n=0}^{\infty} G_n^{(\nu,\mu)}(x)\,t^n.$$


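Gegenbauer Polynomials $C_n^{\lambda}(x)$

Orthogonality:

$$\int_{-1}^{1}(1-x^2)^{\lambda-\frac{1}{2}}\,C_m^{\lambda}(x)\,C_n^{\lambda}(x)\,dx = \frac{\sqrt{\pi}\,\Gamma(n+2\lambda)\,\Gamma(\lambda+\frac{1}{2})}{n!\,(n+\lambda)\,\Gamma(2\lambda)\,\Gamma(\lambda)}\,\delta_{mn}.$$

Rodrigues formula:

$$C_n^{\lambda}(x) = \frac{(-1)^n\,\Gamma(n+2\lambda)\,\Gamma(\lambda+\frac{1}{2})}{2^n n!\,\Gamma(n+\lambda+\frac{1}{2})\,\Gamma(2\lambda)}\,(1-x^2)^{-\lambda+\frac{1}{2}}\frac{d^n}{dx^n}(1-x^2)^{n+\lambda-\frac{1}{2}}.$$

Differential equation:

$$(1-x^2)\frac{d^2}{dx^2}C_n^{\lambda}(x) - (2\lambda+1)x\frac{d}{dx}C_n^{\lambda}(x) + n(n+2\lambda)\,C_n^{\lambda}(x) = 0.$$

Recurrence formula:

$$(n+1)C_{n+1}^{\lambda}(x) - 2(n+\lambda)x\,C_n^{\lambda}(x) + (n+2\lambda-1)\,C_{n-1}^{\lambda}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{1}{(1-2xt+t^2)^{\lambda}} = \sum_{n=0}^{\infty} C_n^{\lambda}(x)\,t^n.$$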


Legendre Polynomials $P_n(x)$

Orthogonality:

$$\int_{-1}^{1} P_m(x) P_n(x)\,dx = \frac{2}{2n+1}\,\delta_{mn}.$$

Rodrigues formula:

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2-1)^n.$$

Differential equation:

$$(1-x^2)\frac{d^2}{dx^2}P_n(x) - 2x\frac{d}{dx}P_n(x) + n(n+1)P_n(x) = 0.$$

Recurrence formula:

$$(n+1)P_{n+1}(x) - (2n+1)xP_n(x) + nP_{n-1}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{1}{(1-2xt+t^2)^{1/2}} = \sum_{n=0}^{\infty} P_n(x)\,t^n.$$
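Tabulated relations of this kind are easy to cross-check by machine. As one hedged example, the sketch below builds the Legendre polynomials from the recurrence formula, starting from $P_0(x) = 1$ and $P_1(x) = x$, and then tests the orthogonality relation exactly with rational arithmetic. The helper names (`poly_mul`, `legendre`, `integrate_minus1_to_1`) and the representation of a polynomial as a coefficient list, lowest degree first, are assumptions made for this illustration.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def legendre(n_max):
    """P_0 .. P_n_max from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]  # multiply P_n by x
        prev = polys[n - 1] + [Fraction(0)] * (len(x_pn) - len(polys[n - 1]))
        polys.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)])
    return polys[: n_max + 1]

def integrate_minus1_to_1(p):
    """Exact integral over [-1, 1]; odd powers vanish, x^k gives 2/(k+1) for even k."""
    return sum(c * 2 / (k + 1) for k, c in enumerate(p) if k % 2 == 0)

P = legendre(5)
for m in range(6):
    row = [integrate_minus1_to_1(poly_mul(P[m], P[n])) for n in range(6)]
    print(m, [str(v) for v in row])
```

Each printed row should be zero except for the diagonal entry $2/(2m+1)$. The same pattern, with a different weight function and recurrence, can be adapted to the other families in this appendix.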


Chebyshev Polynomials of the First Kind $T_n(x)$

Orthogonality:

$$\int_{-1}^{1}\frac{T_m(x)\,T_n(x)}{\sqrt{1-x^2}}\,dx = \frac{\pi}{2}\,\delta_{mn}\left(1+\delta_{m0}\delta_{n0}\right).$$

Rodrigues formula:

$$T_n(x) = \frac{(-2)^n n!}{(2n)!}\sqrt{1-x^2}\,\frac{d^n}{dx^n}(1-x^2)^{n-\frac{1}{2}}.$$

Differential equation:

$$(1-x^2)\frac{d^2}{dx^2}T_n(x) - x\frac{d}{dx}T_n(x) + n^2T_n(x) = 0.$$

Recurrence formula:

$$T_{n+1}(x) - 2xT_n(x) + T_{n-1}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{1-t^2}{1-2xt+t^2} = \sum_{n=1}^{\infty} 2T_n(x)\,t^n + T_0(x).$$


Chebyshev Polynomials of the Second Kind $U_n(x)$

Orthogonality:

$$\int_{-1}^{1}\sqrt{1-x^2}\;U_m(x)\,U_n(x)\,dx = \frac{\pi}{2}\,\delta_{mn}.$$

Rodrigues formula:

$$U_n(x) = \frac{(-2)^n (n+1)!}{(2n+1)!\,\sqrt{1-x^2}}\,\frac{d^n}{dx^n}(1-x^2)^{n+\frac{1}{2}}.$$

Differential equation:

$$(1-x^2)\frac{d^2}{dx^2}U_n(x) - 3x\frac{d}{dx}U_n(x) + n(n+2)U_n(x) = 0.$$

Recurrence formula:

$$U_{n+1}(x) - 2xU_n(x) + U_{n-1}(x) = 0.$$

Generating functions:

$$g(t,x) = \frac{1}{1-2xt+t^2} = \sum_{n=0}^{\infty} U_n(x)\,t^n.$$
Abscissa of absolute convergence, 411, 412

Abscissa of convergence, 410-412, 415 
Absolute convergence, 32, 38, 40, 42, 
67, 214, 383, 411, 412, 424 
Absolute convergence of an infinite 
series, 33, 38, 40 

Absolute maximum theorem, 207 
Absolute minimum theorem, 208 
Absolutely divergent series, 42 
Accumulation point, 6, 657 
Active transformation, 570, 571 
Addition, 83, 640, 641, 643, 646 
Addition formula for analytic continuation, 254

Addition identity, 83 
Addition of complex number, 74 
Addition of tensor, 586 
Addition of vector, 74 
Addition theorem of trigonometric 
function, 251 
Additive inverse, 83 
Adjoint operator, 502 
Admissibility condition, 450, 459 
Admissibility constant, 450 
Airfoil, 335, 336 
Aliasing, 394 

Almost everywhere, 156, 158, 160, 168, 
174, 456, 474 
Alternating sequence, 42 
Alternating series, 42 
Alternating series test, 42 
Amplitude modulation, 404 


Analytic continuation, 215, 216, 246, 
247, 249, 251-254, 284, 409, 420, 
421, 432 

Analytic continuations of each other, 
247, 248 

Analytic function, 102, 125, 186-188, 
191-194, 198, 201, 202, 204-208,
210, 213, 217, 220, 236, 237, 244, 
251, 254, 259, 263, 286, 289, 290, 
305, 306, 409, 432, 436, 438 
Analyticity, 188, 190, 192, 193, 195, 
261, 312, 321, 412, 432 
Analyticity at infinity, 313 
Angle-preserving, 305, 306 
Angular momentum, 583, 591, 596 
Angular velocity, 596 
Anharmonic ratio, 322 
Antisymmetric part of tensor, 590 
Antisymmetric tensor, 589-591 
Approximation coefficient, 472, 478-480 
Area under the graph, 145, 382 
Associated Legendre function, 112 
Associative, 83, 387, 389, 646 
Associativity, 83 

Asymptotically stable critical point, 528 
Auto-correlation function, 390 
Autonomous system, 528 
Axial vector, 582 

Banach space, 86, 87 

Banach’s fixed point theorem, 173 

Basis, 79, 87, 88, 612, 642, 646, 649, 650 

Basis of a Hilbert space, 91 

Basis of tensor space, 647, 648 





Bernoulli equation, 505, 506 
Bernoulli’s theorem, 230, 232 
Bessel equation, 505 
Bessel function, 385, 403 
Bessel inequality, 79, 82, 89, 96, 104, 
357 

Beta function, 108 

Bilateral Laplace transform, 433 

Bilinear, 643, 651 

Bilinear function, 644, 645, 648, 651, 
652 

Bilinear transformation, 316, 317, 321, 
324, 325 

Binomial theorem, 24 
Bit reversal, 398 
Bit-reversing process, 399, 401 
Blasius’ formula, 230-232 
Bolzano-Weierstrass’ theorem, 26, 27,
80, 657, 659 
Boundary point, 7 
Bounded above, 3, 20, 21, 36-38, 43 
Bounded almost everywhere, 160 
Bounded below, 3, 20, 21 
Bounded closed interval, 5 
Bounded open interval, 5 
Bounded real sequence, 19-21, 26, 27 
Bounded set, 3 

Branch, 226, 241-244, 252, 272, 273, 
276, 421 

Branch at infinity, 313 

Branch cut, 226, 243, 244, 441, 442 

Branch line, 244 

Branch point, 235, 243, 244, 252, 282, 
441 

Brownian motion, 550, 551 

Cantor set, 155, 157 
Cantor’s theorem, 658 
Cardinal number, 155 
Carrier wave, 404-406 
Cartesian basis, 79 
Cartesian coordinate, 317, 319 
Cartesian coordinate system, 567, 568, 
576, 580, 582, 602, 603 
Cartesian product of vector spaces, 643 
Cartesian space, 78 
Cartesian tensor, 578, 589, 596, 601 
Cartesian tensor of the first order, 576, 
577 


Cartesian tensor of the fourth order, 
585, 600 

Cartesian tensor of the second order, 
578 

Cartesian tensor of the third order, 585 
Cartesian vector, 576-578, 582 
Cauchy criterion, 25-27, 31, 36, 38, 53, 
55, 59, 62, 64, 79 

Cauchy criterion for convergence, 31 
Cauchy criterion for uniform convergence, 53

Cauchy inequality, 208 
Cauchy principal value, 294 
Cauchy problem, 552, 553 
Cauchy sequence, 25-28, 36, 54, 55, 79, 
80, 82, 86, 91-94, 170, 171, 174, 
175, 496 

Cauchy’s integral formula, 124, 205-207, 
213, 217, 432 

Cauchy’s test for improper integrals, 68, 425, 428

Cauchy’s theorem, 198-201, 205, 210, 
220, 262, 274, 291 

Cauchy-Riemann relation, 189, 194, 
308, 309, 332 

Causality requirement, 300 
Center, 533, 535 

Central limit theorem, 175-178, 180 
Characteristic curve of a PDE, 542 
Characteristic equation of a PDE, 542 
Characteristic function, 176, 178, 180 
Characteristic polynomial, 529 
Chebyshev polynomial of the first kind, 119, 129-131, 133, 135, 674

Chebyshev polynomial of the second kind, 119, 674

Chebyshev’s inequality, 157, 158 
Christoffel symbol, 621-625, 627, 628, 
630-632, 635 

Christoffel symbol of the first kind, 623 
Christoffel symbol of the second kind, 
623 

Circle of convergence, 214-217, 221, 
222, 246, 250, 253, 254, 256 
Circulation (of fluid flow), 229 
Clairaut equation, 488

Closed set, 7 

Closedness, 83, 430 

Closure, 5, 7 

Cluster point, 6, 657 

Cofactor, 524, 572-575 

Column-vector notation, 510 

Commutative, 83, 387, 389, 640, 646 

Complement, 2 

Complementary minor of an element of 
a matrix, 571 

Complementary set, 2, 7, 8, 149, 150, 
156, 157 

Complementary system, 514 
Complete, 73, 79, 80, 86-89, 91, 101, 
170, 173, 515 

Complete analytic function, 248 
Complete integral of an ODE, 487 
Complete orthonormal set of functions, 
73, 97, 98, 109, 463, 464 
Complete orthonormal set of polynomials, 101, 103-105, 117, 119

Complete orthonormal vectors, 89 
Completeness of wavelets, 462, 463 
Complex conjugate, 75, 533, 641 
Complex function, 185, 186 
Complex sphere, 311 
Complex vector space, 74-76, 83 
Component of a tensor, 565, 576, 578 
Conditional convergence, 32-35, 37, 42 
Conditional convergence of an improper 
integral, 67, 412 

Conditional convergence of an infinite 
series, 33 
Conformal, 305 

Conformal mapping, 306-308, 310-313, 
315-317, 321, 322, 324, 328, 330, 
331, 333-335 

Conjugate harmonic function, 192 
Conjugate linear, 75 
Conjugate space, 641 
Conservation law of current flow, 537 
Conservation law of momentum, 230, 
231, 373 

Conservation of a functional equation, 
250, 253 

Contact point, 5-9 

Continuity of complex function, 186 


Continuity theorem (for characteristic 
functions), 180 

Continuity theorem (of integrals), 67 
Continuous approximation, 472, 476 
Continuous function, 47, 50 
Continuous on the left, 48 
Continuous on the right, 48 
Contraction, 587-589, 591, 596, 598, 
618 

Contraction mapping, 173, 174 
Contraction mapping theorem, 173-175 
Contrapositive proof, 9 
Contravariant basis vector, 603, 604, 
606, 621 

Contravariant component of a tensor, 
608-610, 612, 613, 618, 619, 629 
Contravariant component of a vector, 
607, 611, 613, 617, 627, 628, 631, 
650, 653 

Contravariant degree of tensor, 645 
Contravariant local basis vector, 603 
Contravariant tensor, 646 
Contravariant vector, 646, 647 
Convergence test for alternating series, 42

Convergence almost everywhere, 156, 
171 

Convergence of a real sequence, 17 
Convergence of a sequence of vectors, 80

Convergence of an improper integral, 67 
Convergence of an infinite series, 30, 33 
Convergence of Laplace integral, 408-411, 422, 424-427, 430-432, 435

Convergence test, 29, 38, 42 
Convolution, 387-389, 451, 453, 459, 
479 

Convolution integral, 447, 476 
Convolution theorem, 387 
Coordinate, 88, 567 
Coordinate axis, 567 
Coordinate transformation, 566, 567, 
570, 577, 580 
Corner, 50 

Correlation function, 388 
Countable set, 154, 155 
Covariant basis vector, 603-605, 608, 
613, 617, 619, 621 
Covariant component of a tensor, 609, 610, 613, 619, 632

Covariant component of a vector, 607, 
608, 611, 613, 617, 627, 628, 631, 
635, 650 

Covariant constant, 633 
Covariant degree of tensor, 645 
Covariant derivative, 628-633 
Covariant differentiation, 634, 635 
Covariant local basis vector, 602, 603 
Covariant tensor, 646 
Covariant vector, 608, 646, 647 
Critical point of an autonomous system, 
527-534 

Critical point of conformal mapping, 
308, 310 

Cross ratio, 322, 325 
Cross-correlation function, 388-390 
Curvature tensor, 635 
Curvilinear coordinate system, 565, 
601-603, 605, 607, 611, 615, 617, 
621 
Cut, 244 

Cylindrical coordinate system, 612, 616, 
621, 624 

D’Alembertian, 552
Damped harmonic oscillator, 383 
Damping time constant, 446 
Darboux’s inequality, 196, 209, 211 
Decomposition algorithm, 478, 479 
Decreasing sequence, 20 
Derivative (of a complex function), 186, 
187 

Derivative (of a real function), 48 
Determinant of a matrix, 571 
Difference, 2 

Differentiability (of a complex function), 186, 188

Differentiability (of a real function), 48 
Diffusion constant, 551 
Diffusion equation, 371, 545, 550, 551, 
561, 562 

Diffusion operator, 546, 550 
Dilatation parameter, 451, 454 
Dilation equation, 468 
Dimension of a vector space, 88 
Dirac’s $\delta$-function, 661-663, 667


Direct product (of vector spaces), 643, 
645, 646 

Direct product (of vectors), 578 
Direct proof, 9 

Direct sum of vector spaces, 643 
Directed cosine, 568 
Direction field, 490, 494, 526 
Dirichlet boundary condition, 332-334, 
540, 556 

Dirichlet problem for the diffusion 
equation, 551 

Dirichlet problems for the Laplace 
equation, 548 
Dirichlet theorem, 360 
Dirichlet’s conditions for the Fourier 
series convergence, 341, 347 
Dirichlet’s function, 149, 155, 156, 172 
Dirichlet’s integral, 353, 358 
Dirichlet’s kernel, 354 
Dirichlet’s theorem, 341 
Discrete Fourier transform, 391-394, 
396, 398, 400, 401 
Discrete wavelet, 460, 462, 463 
Discrete wavelet transform, 460-462, 
467, 472, 476, 478 
Disjoint interval, 141, 146 
Disjoint set, 2, 144, 145 
Dispersion relation, 297-302 
Displacement vector, 600 
Distance, 84, 174, 639 
Distance function, 84, 85 
Distribution, 176-180 
Distributive, 83, 387, 389, 646 
Divergence (as a vector operation), 631 
Divergent sequence, 18, 19 
Divergent series, 32, 33, 35, 42 
Divergent test, 32 

Dominated convergence theorem, 158, 
160, 161, 165, 166 
Dual basis, 642, 652 
Dual space, 641, 642, 645, 646 
Dummy index, 566, 626, 628, 629 
Dyadic grid arrangement, 461 
Dyadic grid wavelet, 461 

Eigenenergy, 136 
Eigenfrequency, 557 
Eigenfunction, 136, 501, 504, 505, 557 
Eigenvalue, 501, 504, 505, 530-534, 544 
Eigenvector, 530-534
Einstein tensor, 636 
Einstein’s field equation, 635-637 
Elasticity theory, 585, 600
Electric conductivity, 598 
Electromagnetic field, 599 
Element, 1-5, 7, 74, 76, 83 
Elliptic class of PDEs, 544 
Elliptic coordinate, 319 
Elliptic coordinate system, 565 
Elliptic integral of the first kind, 326 
Empty set, 1, 141, 142 
Entire function, 191, 209, 313 
Enumerable, 154 
Equal, 2 

Equality almost everywhere, 156, 158, 
174, 175, 456, 474 
Equivalent, 10 

Essential singularity, 233, 235-240, 282 
Essential singularity at infinity, 313 
Euclidean space, 3, 74, 75, 515, 614, 
639, 640, 651 
Euler’s formula, 108 
Euler-Fourier formula, 340, 344 
Existence theorem, 491, 495, 498, 515 
Expected value of a random variable, 
143, 176 

Explicit solution of an ODE, 484 
Exponential order, 423-427, 431 
Extended definition of conformal 
mappings, 312 
Extended real number, 3 

False, 9 

Fast Fourier transform, 396, 397 
Fast Fourier transform (FFT), 396, 398, 
399, 401 

Fast orthogonal wave transform, 478 
Fast wavelet transform, 460, 477-480 
Father wavelet, 463, 470, 477 
Fejér’s integral, 353, 358
Fejér’s theorem, 355, 360, 361
Finite set, 1, 140, 154, 155, 657 
First shifting theorem, 415 
First-order Cartesian pseudotensor, 582, 585

First-order linear homogeneous ODE, 
484 


First-order linear homogeneous PDE, 541

Fixed point in $L^p$, 174
Flat Riemann space, 614 
Flat space, 634-636 
Fluid flow, 229 
Focus, 533 
Four potential, 599 
Four-current density, 600 
Four-vector, 76 
Four-velocity, 636 
Fourier coefficient, 95-98, 105 
Fourier cosine series, 344, 345, 350 
Fourier integral, 383 
Fourier integral representation, 378 
Fourier integral theorem, 379, 380 
Fourier series, 95, 96, 339-341, 360, 363, 
366, 377 

Fourier sine series, 344, 350 
Fourier transform, 378, 382, 390, 391, 
406, 435, 559 

Fourier transform in three dimension, 
384 

Fourier transform in two dimension, 385 
Fractional transformation, 316 
Fraunhofer diffraction, 401, 403 
Frequency modulation, 404 
Fresnel cosine integral, 279, 281 
Fresnel sine integral, 279, 281 
Fubini’s theorem, 162-164, 166, 167, 
173, 175, 176, 178, 180 
Fubini-Hobson-Tonelli theorem, 164 
Function element, 247 
Function of exponential order, 423-427, 
431 

Function space, 172, 173 
Fundamental matrix, 518, 519, 521 
Fundamental mixed tensor, 611 
Fundamental sequence, 25 
Fundamental system of solutions, 
516-523 

Fundamental tensor, 585 

G.l.b., 4 

Gamma function, 108, 254 
Gauss notation, 106 
Gegenbauer polynomial, 119, 673 
General analytic function, 248 
General relativity theory, 634, 636 
General solution of a differential equation, 316, 372, 375, 487-489, 522, 530, 531, 533, 534, 540, 542, 545, 553, 554, 557, 559
Generalized Fourier coefficient, 91, 95 
Generalized Fourier series, 95 
Generating function, 113, 114, 124-126, 
470 

Generating function of Chebyshev polynomials of the first kind, 674
Generating function of Chebyshev polynomials of the second kind, 675

Generating function of Gegenbauer 
polynomials, 673 
Generating function of Hermite 
polynomials, 125, 671 
Generating function of Jacobi polynomials, 673

Generating function of Laguerre 
polynomials, 125, 672 
Generating function of Legendre polynomials, 113, 114, 125, 674
Generating function of the multiresolution analysis, 470
Geometric curvature, 634 
Gibbs phenomenon, 347, 365, 366 
Goursat’s formula, 206-208, 261, 265 
Gradient, 631 
Gradient of a scalar, 631 
Gradient of a vector, 580 
Gram-Schmidt orthogonalization 
method, 103, 105, 114, 505 
Greatest lower bound, 4, 411 
Green’s function, 558, 559 
Gutzmer’s theorem, 227 

Haar discrete wavelet, 462, 472, 473 
Haar wavelet, 450, 458, 467, 473 
Half-range Fourier series, 344, 347 
Harmonic function, 191, 195, 546, 549, 
550 

Harmonic series, 35, 36, 39 
Heat flow, 550 
Heat flux, 561 
Hermite equation, 502 
Hermite polynomial, 117, 125, 127, 135, 
671 

Hermitian operator, 503 


Hilbert space, 73, 74, 79-83, 87-92, 95, 
98, 352 

Hilbert space theory, 352 
Hilbert transforms pair, 295-298 
Holomorphic, 188 
Hooke’s law, 600 

Hyperbolic class of PDEs, 544, 546, 
552, 553 

Hyperharmonic series, 36 

Identically distributed, 176, 177, 179 
Identity vector, 74 
If and only if, 10 

Imaginary part of a complex function, 
185 

Implicit solution of an ODE, 485, 486, 
490 

Improper integral, 66-68, 302, 412, 420, 
422, 425-428, 430, 433 
Improper node, 530, 531 
Improper rotation, 580-585 
Incomplete inner product space, 81 
Incompressible, 228 
Increasing sequence, 19 
Independent random variable, 176, 177 
Index lowering, 652 
Index raising, 652 
Inertia tensor, 596-598 
Infimum, 4, 8, 140 

Infinite series, 29-33, 37, 38, 40, 42, 96, 
105, 109, 221, 339 

Infinite series of functions, 62-64, 227, 
281, 340, 342, 362, 496 
Infinite set, 1, 154, 155, 157, 657 
Initial value problem, 419, 491-493, 
495, 497-499, 510, 513, 527, 552, 
554, 555 

Inner measure, 150, 157 
Inner product, 73, 75, 76, 78, 80, 87, 89, 
96, 97, 352, 588, 639, 641, 651 
Inner product (in tensor calculus), 651, 
652 

Inner product notation, 502, 503 
Inner product space, 75-82, 86, 87, 640, 
651 

Integral curve, 488-490 
Integral equation, 492 
Integral function, 313 
Integral of PDE, 540 
Interior point, 7

Intersection, 2, 87, 247, 658, 660 
Interval, 4 
Invariant, 609 
Invariant tensor, 585 
Inverse Fourier transform, 378, 379, 
384, 387, 395, 396, 406, 471 
Inverse Fourier transformation, 435 
Inverse Laplace transform, 408, 409, 
432, 434, 436, 439-441, 444, 446, 
448, 558 

Inverse matrix, 523, 574, 620, 653 
Inverse of discrete Fourier transform, 
392, 393 

Inverse of the two-sided Laplace 
transform, 434, 435 

Inverse wavelet transform, 456-458, 460 
Inversion (as a bilinear transformation), 
321, 327-329 

Inversion (as an improper rotation), 
581, 582 

Irrotational, 228, 231 
Isolated point, 6-8, 95, 149, 212, 657 
Isolated singularity, 233-236, 239, 252, 
262, 263, 313 
Isomorphism, 98, 649 
Isomorphism between $\ell^2$ and $L^2$, 98, 99
Isotropic tensor, 585, 600 

Jacobi matrix, 122 
Jacobi polynomial, 118, 672 
Jacobian determinant, 309, 388, 555, 
617 

Jordan’s lemma, 270, 438, 441 
Joukowsky airfoil, 336 
Joukowsky transformation, 335, 336 

Kinetic energy, 597 
Kramers-Kronig relations, 299 
Kronecker’s delta, 78, 610 
Kutta-Joukowski’s theorem, 228-231 

L’Hôpital’s rule, 12, 13, 239, 280, 282,
370, 417, 423 
L.u.b., 3 

Laguerre polynomial, 117, 118, 125, 671 
Lamé constants, 600
Langevin’s function, 283


Laplace equation, 191, 192, 228, 
331-334, 545, 546, 548-550 
Laplace integral, 408, 410, 411, 421, 
422, 428, 429, 432, 433 
Laplace operator, 546 
Laplace transform, 407-409, 412, 414-418, 420, 422, 432, 444-447, 558, 559

Laplace transform of derivative, 419 
Laplace transform of integral, 420 
Laplacian, 545, 546, 548-550, 559, 630, 
631 

Laurent series expansion, 219-224, 226, 
233-235, 238, 239, 260, 265, 266 
Least upper bound, 3, 658 
Lebesgue convergence theorem, 173, 
175, 178, 180, 181 

Lebesgue integrable, 152, 153, 161, 173, 
174 

Lebesgue integral, 139, 141, 144, 147, 
149, 151-155, 158, 161, 162, 167, 
172, 175, 176 

Lebesgue measurable function, 167, 170 
Lebesgue measurable set, 157 
Lebesgue measure, 149-151, 154-156, 
176 

Lebesgue sum, 151-153
Left-hand limit, 46, 48, 408 
Left-handed coordinate system, 568, 
582 

Legendre polynomial, 105-107, 109, 112-114, 119, 125, 127, 136, 137, 673

Legendre’s equation, 501 
Levi-Civita symbol, 584, 588 
Lift, 336 
Lift force, 228 
Limit, 18 

Limit cycle, 536, 538 

Limit inferior, 21, 22, 140 

Limit of a function, 45 

Limit point, 6, 18, 657 

Limit superior, 21, 22, 40-42, 140 

Limit test for convergence, 38, 39, 43 

Limit test for divergence, 39, 40, 43 

Line element, 490 

Linear autonomous system, 528 

Linear differential equations, 407 

Linear function, 640 
n-linear function, 644
Linear homogeneous ODE, 484, 500 
Linear homogeneous PDE, 541 
Linear homogeneous system of ODEs, 
514, 516, 517, 524, 528 
Linear independence, 76, 79, 82, 88, 
103, 375, 515-517, 519, 520, 522, 
523, 569, 642 

Linear inhomogeneous ODE, 505, 506 
Linear inhomogeneous PDE, 541 
Linear inhomogeneous system of ODEs, 
514, 516 

Linear mapping (of vector spaces), 640, 
650, 651 

Linear ODE, 484 

Linear partial differential equation 
(PDE), 540 
Linear space, 640 

Linear transformation, 315, 316, 514, 
543, 544 

Liouville’s formula, 518, 521, 524 
Liouville’s theorem, 209, 238, 313 
Lipschitz condition, 174, 495, 497, 500, 
512, 515, 526 

Lipschitz constant, 495, 512 
Local basis vectors, 602 
Localization theorem, 368 
Logarithmic residue, 286, 287 
Logistic equation, 506 
Lower bound, 3 
Lower limit, 21 

Lower Riemann-Darboux integral, 140 

Möbius transformation, 316, 322
Magnetic susceptibility, 598 
Maximum, 4 

Maxwell equation, 599, 637 
Maxwell-Boltzmann distribution, 177 
Mean convergence, 95, 97, 105, 351-353, 
355-357, 360, 361 

Mean value of a random variable, 143, 
144, 176 

Mean value theorem, 58, 204, 560, 562, 
662 

Measurable set, 151, 157, 160, 164, 165, 
167 

Measure, 141 

a-measure, 141, 144, 146-148 
Message wave, 404-406 


Method of inversion, 327 
Method of variation of constant 
parameters, 522 
Metric coefficient, 616 
Metric space, 84, 85 
Metric tensor, 611-614, 617-619, 621, 
623, 624, 626, 630, 631, 633-635, 
637 

Metric vector space, 84 

Mexican hat wavelet, 450-452, 455, 467 

Minimax property, 129 

Minimum, 4 

Minkowski’s inequality, 93, 169-171, 
175 

Minor of elements of a matrix, 571 
Mixed component of a tensor, 609-612, 
618 

Modified summation convention, 603, 
605, 606 

Moment of inertia, 597, 609 
Monotone convergence theorem, 
158-161, 165, 166, 171 
Monotonic sequence, 20 
Monotonically decreasing sequence, 20 
Monotonically increasing sequence, 19 
Morera’s theorem, 210 
Mother wavelet, 467-470, 477 
Multilinear function, 644-646 
Multiplication of complex number, 74 
Multiply connected region, 200 
Multipole, 136 

Multiresolution algorithm, 478 
Multiresolution analysis, 463, 464, 
466-470, 474, 477 

Multiresolution analysis equation, 468 
Multiresolution representation, 472 
Multivalued function, 226, 240-244, 
252, 318, 409, 420, 441 

Natural boundary, 249, 253, 256 
Natural isomorphism, 649-651 
Natural pairing, 643 
Necessary and sufficient condition, 10 
Necessary condition, 9 
Neighborhood, 5-8, 10, 12, 18, 24, 66, 
67, 187, 188, 218, 233-236, 239, 
244, 250, 260, 286, 308, 312, 313, 
316, 369, 528, 657, 660 
Neumann boundary condition, 332, 540, 556

Neutrally stable critical point, 528 
Newtonian field of gravity, 636 
Noise reduction, 457, 458 
Non-degenerate, 651 
Non-isolated singularity, 235 
Nonhomogeneous linear partial 
differential equation, 541 
Nonlinear differential equation, 484, 
505, 506, 522, 538, 637 
Nonoverlapping sets, 141 
Norm, 76, 80, 85-87, 89, 91, 94-96, 174, 
352, 473, 510, 639 
p-norm, 85-87, 168, 175 
Normal distribution, 177-180 
Normed space, 85-87 
Null measure, 155, 157 
Nyquist critical frequency, 393 

Once-subtracted dispersion relation, 300

One-sided derivative, 49 
One-sided limit, 46 
Open set, 7 

Order of differential equation, 483 
Order of zero of function, 233 
Ordinary differential equation (ODE), 
174, 483 

Orthogonal basis, 88 
Orthogonal complement, 465 
Orthogonal curvilinear coordinate, 317 
Orthogonal decomposition, 466 
Orthogonal polynomial, 114-117, 119,
121-124, 126, 128, 129
Orthogonal relation, 579 
Orthogonality, 73, 78, 79, 82, 88-90, 
103, 105, 107, 127 

Orthogonality relation, 115, 119, 120, 
129, 133 

Orthonormal basis, 73, 78, 88, 463, 464, 
466-471 

Orthonormality, 78, 101 
Orthonormality of wavelets, 462, 463 
Outer measure, 150, 156 
Outer product, 578, 588, 595, 610 

Parabolic class of PDEs, 544 
Parallelogram law, 78, 87 


Parseval’s identity, 90, 97, 98, 104, 302, 
356, 357, 362, 383, 390 
Parseval’s identity (for wavelet 
transform), 457, 460, 474 
Partial differential equation (PDE), 
371, 539 

Partial sum, 30, 31, 35-38, 43, 64, 99, 
104, 109, 256, 345, 346, 353, 358, 
363, 365, 366, 368, 369, 496, 500 
Particular solution of differential
equation, 487, 506, 522, 523, 553
Partition, 140, 152 
Passive transformation, 570, 571 
Path independence, 198 
Permutation symbol, 584 
Phase space, 527, 538 
Picard’s method, 491, 492, 497, 499 
Piecewise continuous function, 48, 50, 
362-364, 385, 417 

Piecewise smooth function, 50, 360, 
363, 364, 379, 380, 386 
Plancherel’s identity, 390 
Point, 1 

Point at infinity, 237, 238, 244, 312, 
313, 320, 322, 329 

Point of equilibrium of an autonomous 
system, 527 

Pointwise convergence, 51, 52, 54, 60, 
62, 95, 158, 173, 360, 363, 364 
Pointwise limit, 51 
Poisson’s equation, 547, 636 
Poisson’s integral formula, 213 
Polar coordinate system, 108, 194, 244, 
280, 310, 315, 317-319, 323, 403, 
549, 565 

Polar vector, 582 
Pole, 233-238, 240 
Pole at infinity, 313 
Positive definite, 651 
Potential field, 114, 136, 137, 228, 229, 
232, 333-335, 583, 599, 636 
Power spectrum, 383, 390, 405, 406 
Pre-Hilbert space, 86, 87 
Primitive integral of an ODE, 487 
Principal part in the Laurent series, 
222, 234, 235, 238 
Principal value integral, 68, 206, 
293-296, 300 

Probability, 143, 176, 177 
Probability density, 136, 176, 177
Probability density function, 143 
Probability distribution function, 143, 
144 

Product of inertia, 597 
Proof by contradiction, 9 
Proper node, 532, 533 
Proper rotation, 581, 582, 584, 585 
Proper subset, 2 
Pseudotensor, 580, 582 
Pseudovector, 582, 583, 590 
Pyramid algorithm, 478 
Pythagorean formula, 73 

Quantum mechanics theory, 135 
Quotient law, 592 

Radius of convergence, 102, 214-219, 
223, 245, 246, 252, 256, 257 
Random variable, 143, 144, 176, 177, 
179, 180 

Range convention, 566, 568 
Ratio method, 263, 266, 267, 278 
Ratio test for convergence, 40, 44 
Rational function, 132, 237, 238, 267, 
268, 271, 273 

Real part of a complex function, 185 
Real vector space, 83 
Reality condition, 298 
Rearrangement, 34, 35, 38 
Reconstruction algorithm, 478-480 
Rectangular Cartesian coordinate 
system, 565, 567, 570, 606 
Recurrence formula for orthogonal 
polynomials, 119-121, 125-127, 
129, 671-675 

Recurrence relation (for analytic 
continuation), 254 
Recurrence relation (of gamma 
functions), 255 

Recurrence relation (of scaling 
functions), 468 
Reduced system, 514 
Refinement equation, 468 
Reflection, 581-583 
Region of analyticity, 206-208, 250 
Region of convergence of the Laplace 
integral, 410, 425, 427, 430-435, 
444 


Region of the existence, 249 
Regular, 188 
Regular analytic, 188 
Regular part in the Laurent series, 222 
Regular point, 527 
Removable singularity, 233, 234, 236 
Residue, 259, 260, 263-268, 272-274, 
281, 282, 285, 286 

Residue theorem, 259, 260, 263, 267, 
269, 274, 277-279, 281, 286, 437 
Riccati equation, 506 
Ricci curvature, 637 
Ricci scalar, 636 
Ricci tensor, 636 
Ricci’s theorem, 630, 633, 634 
Riemann integrable, 140 
Riemann integrable function, 158, 172 
Riemann integral, 139, 140, 144, 152, 
153, 158, 172, 175 
Riemann space, 614 
Riemann sphere, 311-313 
Riemann sum, 144, 153 
Riemann surface, 241-243, 421, 422, 
441 

Riemann tensor, 635, 636 
Riemann’s theorem, 35 
Riemann-Darboux integral, 140 
Riemann-Lebesgue theorem, 358, 364, 
369 

Riemann-Stieltjes integral, 144 
Riemann-zeta function, 284 
Riesz-Fischer theorem, 96, 98
Right-hand limit, 46, 48, 408 
Right-handed coordinate system, 567 
Rigid rotation of coordinate axes, 
567-570, 575, 576, 580, 583, 585 
Rodrigues formula, 106, 107, 114-119, 
121, 123, 124, 127, 129, 671-675 
Root test for convergence, 41, 44, 253 
Rotation (as a vector operation), 632 
Rotation (of fluid flow), 229 
Rotation with reflection, 581 
Rouché’s theorem, 210, 290-292

Saddle point, 531, 532 
Sampling theorem, 394 
Scalar, 83, 579, 609, 646 
Scalar curvature, 636, 637 
Scalar multiplication, 74, 83, 640, 641,
643 

Scalar product, 75, 579, 604, 607, 611, 
613, 617, 618 

Scale factor, 308, 310, 321, 616, 617, 634 
Scale-dependent thresholding, 458 
Scaling function, 463, 464, 467-471, 
474, 476, 479 

Scaling function coefficient, 468-470, 
474, 477 

Scaling function space, 467 
Schrödinger equation, 136
Schwarz differential equation, 316 
Schwarz Lemma, 211 
Schwarz principle of reflection, 254, 257 
Schwarz’s inequality, 76, 77, 89 
Schwarz-Christoffel transformation, 
325-327, 329, 332, 333 
Second shifting theorem, 416 
Second-order Cartesian pseudotensor, 
584 

Second-order linear homogeneous PDE, 
543 

Secular equation, 529 
Self-adjoint operator, 503 
Semiclosed interval, 5 
Sequence of partial sum, 30, 31, 43, 104, 
109 

Sequence of the remainder, 30 
Set, 1 

Shuffled sequence, 28, 37 
Signal approximation, 472 
Signal detail, 472 
Simple Laplace development, 572 
Simple set, 146 
Simple statement, 9 
Simply connected region, 199 
Single-valued function, 84, 240, 241, 
243, 247, 248, 288-290, 408, 414, 
421, 484, 510, 614 
Singular line, 253 
Singular point of an autonomous 
system, 527 

Singular solution of ODE, 488, 489 
Singularity, 188, 233 
Skew-symmetric, 589 
Smooth function, 50 
Solution of ODE, 484 
Solution of PDE, 540 


L^2 space, 81, 87, 92, 95, 98, 99, 352
L^p space, 86, 87
ℓ^2 space, 80, 87, 91, 92, 95, 96, 98, 99
ℓ^p space, 86, 87
Specific heat, 562 

Spectrum of Sturm-Liouville system, 
501 

Spherical coordinate system, 384, 549, 
550, 616 

Spherical harmonic function, 109, 111, 
112 

Spiral point, 533, 534 
Square-integrable function, 81, 92, 94, 
95, 98, 101, 352, 357, 358 
Stability of critical point, 527, 528 
Stable critical point, 528 
Step function, 81, 144-148, 365, 366,
416, 417, 436, 440, 445
Strain tensor, 600 
Stream function, 228 
Stress tensor, 600 
Strictly decreasing sequence, 20 
Strictly increasing sequence, 19 
Strictly stable critical point, 528 
Sturm-Liouville equation, 500-502, 505 
Sturm-Liouville operator, 500, 503, 504 
Sturm-Liouville system, 501, 504 
Subinterval, 5 
Subset, 1 

Subtraction of tensor, 586 
Successive approximation, 492, 493, 
495, 499 

Sufficient condition, 9 
Sum of infinite series, 30 
Sum of infinite series of functions, 62 
a-summable, 146, 147 
Summation convention, 566, 568, 602 
Support, 144 146 
Supremum, 3, 4, 8, 11, 140 
Symmetric Cartesian tensor of the
second order, 597
Symmetric part of tensor, 590 
Symmetric tensor, 589 

Taylor series expansion, 49, 102, 212, 
217-219, 222-225, 239, 246, 254, 
260, 263, 265, 266, 284, 294, 500, 
528 

Tensor, 565, 645 
Tensor of the first order, 609
Tensor of the second order, 609 
Tensor of zero order, 579, 609 
Tensor product, 644-647 
Tensor space, 646-648, 650 
Thermal conductivity, 372, 561 
Total variation of argument, 288-290 
Trajectory, 526-528, 530 
Transfinite number, 155 
Translation parameter, 451, 453 
Tree algorithm, 478 
Triangle inequality, 77, 92, 169 
Trigonometric Fourier series, 109, 339, 
340 

Trigonometric series, 339, 340, 355, 360 
True, 9 

Tunnel diode, 536 

Two-scale relation, 468, 469, 477, 478 
Two-sided Laplace integral, 433, 434 
Two-sided Laplace transform, 433-435, 
444 

Unbounded open interval, 5 

Unbounded set, 3 

Uniform convergence (of complex-function
sequence), 213, 214, 217, 227

Uniform convergence (of Fourier series), 
341, 342, 352, 355, 359-362, 366 
Uniform convergence (of improper 
integral), 67, 68, 380 
Uniform convergence (of Laplace 
integrals), 425-427, 429-432 
Uniform convergence (of polynomial 
sequence), 101, 102, 104 
Uniform convergence (of real-function 
sequence), 52-68, 95, 158, 172, 497 
Union, 2 

Uniqueness of the integral, 198 
Uniqueness theorem (for analytic 
continuation), 250, 252, 254 
Uniqueness theorem (for characteristic 
function), 176, 179 
Uniqueness theorem (for solution of 
ODE), 136, 491, 497, 498, 515, 
516, 526 

Uniqueness theorem of the Dirichlet
problem (for the diffusion
equation), 552


Uniqueness theorem of the Dirichlet
problem (for the Laplace
equation), 548
Unit scalar, 83 
Unitary space, 75 

Universal gravitational constant, 636 

Universal set, 2 

Unstable critical point, 528 

Upper bound, 3 

Upper limit, 21 

Upper Riemann-Darboux integral, 140 

Van der Pol equation, 538 
Vanishing order, 11 
Variation of argument, 288 
Vector, 74 

Vector space, 73, 74, 83, 640 
Velocity potential, 228 

Wave equation, 373, 545, 552, 555, 556, 
558, 559 

Wave operator, 546, 552, 553 

Wavelet, 449 

Wavelet analysis, 449 

Wavelet coefficient, 469, 472, 477, 478 

Wavelet space, 467 

Wavelet transform, 451-456, 458, 460 

Weierstrass approximation theorem, 101

Weierstrass’ M test, 63 
Weierstrass’ test for improper integral, 
68, 426 

Weight function, 75, 115, 452, 500, 502, 
505 

Wiener-Khinchin theorem, 389, 390
Wigner-Seitz cell, 348 
Winding number, 262, 263, 291
Wronskian, 518, 521 
Wronsky determinant, 518, 521 

Zero of function, 105, 122, 129, 130, 
132, 133, 207, 209, 210, 212, 264, 
286, 289, 367, 404 
Zero scalar, 83 
Zero vector, 78, 83, 89, 90 
Zeros of function, 264 
Zeta function, 36 



